GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         January 7, 2004, 16:44:17 ; Search time 21 Seconds
                (without alignments)
                678.989 Million cell updates/sec

Title:          US-10-088-872-2
Perfect score:  1704
Sequence:       1 MKKMPLFSKSHKNPAEIVKI . . FADEKNYLIKQIRDLKKTAP 337

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       328717 seqs, 42310858 residues

Total number of hits satisfying chosen parameters:      328717

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
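
The throughput figure in the header can be cross-checked from the other header values. The sketch below assumes that a "cell update" means one dynamic-programming cell per (query residue, database residue) pair, which is the usual reading for a Smith-Waterman search; that interpretation and the function name are assumptions, not something the report states.

```python
def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Cells filled per second, assuming one cell per query/database residue pair."""
    return query_len * db_residues / seconds


# Values copied from the report header: 337-residue query,
# 42310858 residues searched, 21 seconds of search time.
rate = cell_updates_per_sec(query_len=337, db_residues=42_310_858, seconds=21)
print(f"{rate / 1e6:.3f} Million cell updates/sec")  # 678.989
```

Under that assumption the computed rate agrees with the 678.989 Million cell updates/sec printed by the program.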



Database :       Issued_Patents_AA:*
               1:  /cgn2_6/ptodata/1/iaa/5A_COMB.pep
               2:  /cgn2_6/ptodata/1/iaa/5B_COMB.pep
                   /cgn2_6/ptodata/1/ii
                   /cgn2_6/ptodata/1/ic
                   /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*
                   /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*
                   /cgn2_6/ptodata/1/iaa/backfiles1.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

RESULT 1

US-09-190-965-1 

; Sequence 1, Application US/09190965 

; Patent No. 6071721 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J.

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/09/190,965 
; CURRENT FILING DATE: 1998-11-13 
; NUMBER OF SEQ ID NOS : 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 1 

LENGTH: 337 

TYPE : PRT 



ORGANISM: Homo sapiens 
FEATURE: - 

OTHER INFORMATION: 3734805 
US-09-190-965-1 



Query Match 100.0%; Score 1704; DB 3; Length 337; 

Best Local Similarity 100.0%; Pred. No. 1.6e-161;

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0 
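
Each hit ends its header with the same three statistics lines, written as `Key value;` pairs. A minimal sketch of reading them into a dictionary follows; the function name `parse_stats` is illustrative and not part of GenCore, and the parser assumes the fields are separated by semicolons as they are in this report.

```python
def parse_stats(text: str) -> dict[str, float]:
    """Turn 'Key value;' pairs from a hit's statistics lines into floats."""
    stats = {}
    for field in text.replace("\n", " ").split(";"):
        field = field.strip()
        if not field:
            continue  # skip the empty piece after a trailing ';'
        key, value = field.rsplit(None, 1)  # the value is the last token
        stats[key] = float(value.rstrip("%"))
    return stats


block = """
Query Match 100.0%; Score 1704; DB 3; Length 337;
Best Local Similarity 100.0%; Pred. No. 1.6e-161;
Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0;
"""
stats = parse_stats(block)
print(stats["Score"], stats["Pred. No."])
```

Splitting on the last whitespace keeps multi-word keys such as `Best Local Similarity` and `Pred. No.` intact.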

RESULT 2

US-09-470-253-1 

; Sequence 1, Application US/09470253

; Patent No. 6365371 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C.

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 

; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/09/470,253
; CURRENT FILING DATE: 1999-12-22 
; PRIOR APPLICATION NUMBER: 09/190,965 
; PRIOR FILING DATE: 1998-11-13 
; NUMBER OF SEQ ID NOS : 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 1 

LENGTH: 337 

TYPE : PRT 

ORGANISM: Homo sapiens 
FEATURE : - 

OTHER INFORMATION: 3734805 



US-09-470-253-1 



Query Match 100.0%; Score 1704; DB 4; Length 337; 

Best Local Similarity 100.0%; Pred. No. 1.6e-161; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

RESULT 3
US-09-190-965-3

; Sequence 3, Application US/09190965 

; Patent No. 6071721 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 3 

LENGTH: 341 

TYPE: PRT 

ORGANISM: Mus sp.

FEATURE: - 

OTHER INFORMATION: g262934 
US-09-190-965-3 

Query Match 80.8%; Score 1376; DB 3; Length 341;

Best Local Similarity 80.7%; Pred. No. 7.3e-129; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2; 



Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

II I lllhlhlll ||:::|:|||| I I i : I I : I I I I I : I | | | | | | | | | | | 
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

.^^ II ihllllllllllllll lllllllilllllhllllll 

Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

HllllllllhhIII llllllllllllllllllhl II llhllhlllll 
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

lllllllllllllllhl hlllhll I : IIMIIIIMIIIIhlll 

Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

ill lllilllllllllllllllllll II III IMIIIII I hh I I Mh:| III II 
Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336 

lllllll II H|:|||| III Ihlllhlh I 
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337 
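
The counts printed for each hit appear to be related in a simple way: in every alignment of this report, Best Local Similarity equals Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This is an observation from the numbers shown here, not a documented GenCore definition, and the function name below is illustrative.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Percent identity over all alignment columns, rounded as in the report."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from three hits in this report.
print(best_local_similarity(272, 32, 29, 4))      # mouse hit, reported 80.7%
print(best_local_similarity(217, 59, 54, 4))      # Drosophila hit, reported 65.0%
print(best_local_similarity(211, 53, 68, 17))     # C. elegans hit, reported 60.5%
```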

RESULT 4 

US-09-470-253-3 

Sequence 3, Application US/09470253 
Patent No. 6365371 
GENERAL INFORMATION: 
APPLICANT: Tang, Y. Tom 
APPLICANT: Guegler, Karl J. 
APPLICANT: Corley, Neil C. 
APPLICANT: Gorgone, Gina A. 

TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
FILE REFERENCE: PF-0635 US 

CURRENT APPLICATION NUMBER: US/09/470,253 
CURRENT FILING DATE: 1999-12-22 
PRIOR APPLICATION NUMBER: 09/190,965 
PRIOR FILING DATE: 1998-11-13 
NUMBER OF SEQ ID NOS : 5 
SOFTWARE: PERL Program 
SEQ ID NO 3 
LENGTH: 341 
TYPE : PRT 
ORGANISM: Mus sp. 
FEATURE : - 

OTHER INFORMATION: g262934 
US-09-470-253-3 

Query Match 80.8%; Score 1376; DB 4; Length 341; 

Best Local Similarity 80.7%; Pred. No. 7.3e-129;

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2; 

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

RESULT 5
US-09-190-965-4
; Sequence 4, Application US/09190965
; Patent No. 6071721



; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 
; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/09/190,965
; CURRENT FILING DATE: 1998-11-13 
; NUMBER OF SEQ ID NOS : 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 4 

LENGTH: 339 

TYPE : PRT 

ORGANISM: Drosophila melanogaster 
FEATURE : - 

OTHER INFORMATION: g1794137
US-09-190-965-4 

Query Match 65.1%; Score 1109; DB 3; Length 339; 

Best Local Similarity 65.0%; Pred. No. 2.8e-102; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 

Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

INI M hi hll Ih : II hi :|| hllhl ::| :| j::: III 
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLHGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

Mlhlllhl Ih II :| Illllll I lllhllllllNIIIIIII 

Db 61 DYVVAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNLLRRQIGTRSPTVEYICTK 120

RESULT 6

US-09-470-253-4 

; Sequence 4, Application US/09470253 

; Patent No. 6365371 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/09/470,253 
; CURRENT FILING DATE: 1999-12-22 
; PRIOR APPLICATION NUMBER: 09/190,965 
PRIOR FILING DATE: 1998-11-13 
NUMBER OF SEQ ID NOS : 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 4 

LENGTH: 339 
TYPE : PRT 

ORGANISM: Drosophila melanogaster 
FEATURE : - 

OTHER INFORMATION: g1794137
US-09-470-253-4 

Query Match 65.1%; Score 1109; DB 4; Length 339; 

Best Local Similarity 65.0%; Pred. No. 2.8e-102;

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 
Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239
llll-'llhlllllhl |:||: III I : |::|| lllllhlllllllllhlll 
Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299

IN H|:|||:|||lllll|:|::|| II II I I I II I II I II : h I HhHIhIl 
Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333

lh:M::| :|::|||| ||| lllllj-ll 
Db 301


RESULT 7 
US-09-190-965-5 

Sequence 5, Application US/09190965 
Patent No. 6071721 
GENERAL INFORMATION: 
APPLICANT: Tang, Y. Tom 
APPLICANT: Guegler, Karl J.
APPLICANT: Corley, Neil C. 
APPLICANT: Gorgone, Gina A. 

TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
FILE REFERENCE: PF-0635 US 

CURRENT APPLICATION NUMBER: US/09/190,965 
CURRENT FILING DATE: 1998-11-13 
NUMBER OF SEQ ID NOS : 5 
SOFTWARE: PERL Program 
SEQ ID NO 5 
LENGTH: 377 
TYPE: PRT 

ORGANISM: Caenorhabditis elegans
FEATURE : - 

OTHER INFORMATION: g1255838
US-09-190-965-5 



Query Match 62.4%; Score 1063.5; DB 3; Length 377; 

Best Local Similarity 60.5%; Pred. No. 1.1e-97;

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3; 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49

M N lllhll-ll I- I Ihl III III :||||:: : 

Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106 

, I ^ I ^ M :| I I II III: h: : I || | : I I I I | M I I I I : M 

Db 61 KSFIYGNDSAEPSSEHVVQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166 

INIMIIIIII: I I II |::|| | ||| |hllll llh l||IM:|: I 
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226 

II II-' llhlllhtlhl Mil ::|:||: MM I h || hMlhl 
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240



Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286

IN I II I Ihl I II 1 1 IIIIII hlhlllllllll llhllllllllllhhl 

Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335 

HI Nl :|: IhIMM :| I MM I III IIIIII::: I : 
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349
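
The start and end numbers on each `Qy` and `Db` line can be used to check a transcribed alignment line: a gap character consumes no residue, so the end number should equal the start number plus the count of non-gap characters, minus one. The helper below is a sketch with an illustrative name; it assumes `-` is the only gap character.

```python
def end_position(start: int, aligned: str) -> int:
    """Residue number of the last residue on an alignment line ('-' is a gap)."""
    residues = sum(1 for ch in aligned if ch != "-")
    return start + residues - 1


# An ungapped query line (287..335) and a gapped one (4..59) from this report.
print(end_position(287, "TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT"))
print(end_position(4, "MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK"))
```

A line whose computed end differs from the printed end has lost or gained characters somewhere, which is a quick way to spot transcription damage in the sequence text.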

RESULT 8 

US-09-470-253-5 

Sequence 5, Application US/09470253 
Patent No. 6365371 
GENERAL INFORMATION: 
APPLICANT: Tang, Y. Tom 
APPLICANT: Guegler, Karl J. 
APPLICANT: Corley, Neil C. 
APPLICANT: Gorgone, Gina A. 

TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
FILE REFERENCE: PF-0635 US 

CURRENT APPLICATION NUMBER: US/09/470,253
CURRENT FILING DATE: 1999-12-22 
PRIOR APPLICATION NUMBER: 09/190,965 
PRIOR FILING DATE: 1998-11-13 
NUMBER OF SEQ ID NOS : 5 
SOFTWARE: PERL Program 
SEQ ID NO 5 
LENGTH: 377 
TYPE : PRT 

ORGANISM: Caenorhabditis elegans
FEATURE : - 

OTHER INFORMATION: g1255838
US-09-470-253-5 

Query Match 62.4%; Score 1063.5; DB 4; Length 377; 

Best Local Similarity 60.5%; Pred. No. 1.1e-97;

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3; 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49

M II ||||:||::|||:: Mhl Ml III :||||:: : 

Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106

.1 II : I IIMMhh: :| M I Ml III! |||||:|| 

Db 61 KSFIYGNDSAEPSSEHVVQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166 

IIIINNMIh I I II hMI I Ml IhllM MM IIMIIM- I 

Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180 

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226

II M: IIIMIMMIIM MM :MM|: MM I h II hlllhl 

Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286

IIMMIIhllllM IIIIII IMhIII IIMII llhllllllllllhhl 

Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335

^imnh Ihllll I :||IMil III Illll|:::| : 
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349



RESULT 9 

US-09-914-259-11 

; Sequence 11, Application US/09914259 

; Patent No. 6495336

; GENERAL INFORMATION: 

; APPLICANT: Makowski, Lee 

; APPLICANT: Hyman, Paul 

; APPLICANT: Williams, Mark 

; TITLE OF INVENTION: STAGED ASSEMBLY OF NANOSTRUCTURES 

FILE REFERENCE: 8471-010-999 
; CURRENT APPLICATION NUMBER: US/09/914,259 
; CURRENT FILING DATE: 2000-11-21 
; NUMBER OF SEQ ID NOS: 180
; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 11 

LENGTH: 3878 

TYPE : PRT 

ORGANISM: Homo sapiens 
US-09-914-259-11 






Query Match 7.5%; Score 128.5; DB 4; Length 3878; 

Best Local Similarity 20.1%; Pred. No. 0.0037; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
- Mill I II : I |:|: :: |: ||: || | 
664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLI TKQNQLI LE 710 

78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

' IN |:: : :: jj j |:::| II h : 

711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766 

126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 
I H I I :: I :| I I 

Db 767 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 797 

186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

I '-\-'\ I I |:::| ::| || |:| ::| I: 

Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 857 

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

\'' \'- \ II I : I : : II 

Db 858 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917 

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320

=^ I INI :| U:: h :|:| I ::|: : :: :| 
Db 918 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 974 

Qy 321 EKNYLIKQIRDLKK 334 

I ::| :::: Ih 



Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997 



RESULT 10 
US-09-724-517-2 

; Sequence 2, Application US/09724517 

; Patent No. 6379941 

; GENERAL INFORMATION:

; APPLICANT: Beraud, Christophe 

; APPLICANT: Freedman, Richard 

; TITLE OF INVENTION: No. 6379941el motor proteins and methods for 
; TITLE OF INVENTION: their use 
; FILE REFERENCE: 1031 

; CURRENT APPLICATION NUMBER: US/09/724,517

; CURRENT FILING DATE: 2000-11-27 

; PRIOR APPLICATION NUMBER: US/09/641,807

; PRIOR FILING DATE: 2000-08-17 

; NUMBER OF SEQ ID NOS : 4 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

LENGTH: 1279

TYPE : PRT 

ORGANISM: Human 

FEATURE : 

NAME/KEY: VARIANT 
LOCATION: (409) . . . (436) 

OTHER INFORMATION: Xaa = any amino acid 
US-09-724-517-2 



Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024;

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82
hi h:| I h hi I ::h -I Ml II 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121

Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFS 162 

hll I :| II : h I h: : j 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE---YRNESIQNRQKSLRASFH 966 

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD--- 200

II H II :| : I I :: UNI 

967 NLSRGEANVLEKLACLSPVEIRTILFRYFNKWNLREAERKQQLYNEEMKMKVLERDNMV 1026 

Qy 201 FLEQNYDTI FEDYEKLLQS 219 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086 

Qy 220 ENYVTKRQS LKLLGELILDRHNFAIM TKYISK 251 

U h I : hll I I I |: |: 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144 



Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF--- 308 

ll^^ll 1- : II :||| : | :: |:| || 

Db 1145 PESMKLSG---REREMDSS ASSLRTQPNPQKLWEDIPELPPIHSSLAPP 1190 

Qy 309 QKERTDDEQFADEKNYLIKQIR 330

I III II : I Ih 

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220 



RESULT 11 
US-09-641-807A-2 

; Sequence 2, Application US/09641807A 

; Patent No. 6440731

; GENERAL INFORMATION: 

; APPLICANT: Beraud, Christophe 

; APPLICANT: Freedman, Richard 

; TITLE OF INVENTION: No. 6440731el motor proteins and methods for 
; TITLE OF INVENTION: their use 
; FILE REFERENCE: 1031 

; CURRENT APPLICATION NUMBER: US/09/641,807A
; CURRENT FILING DATE: 2000-08-17 
; NUMBER OF SEQ ID NOS : 4 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

LENGTH: 1279 

TYPE: PRT 

ORGANISM: Human 

FEATURE : 

NAME/KEY: VARIANT 
LOCATION: (409) . . . (446) 
OTHER INFORMATION: Xaa = any amino acid 
US-09-641-807A-2 



Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024;

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82 

hi l-ll h hi I ::|: ::| Ml II 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121

: I - : I h: hi :: :: :| 

Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909 

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFS 162 

IHI I :| 11 : h I h: : | 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE---YRNESIQNRQKSLRASFH 966 

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD--- 200 

II -A II :| : I I :: : ||| | 

Db 967 NLSRGEANVLEKLACLSPVEIRTILFRYFNKWNLREAERKQQLYNEEMKMKVLERDNMV 1026 

Qy 201 FLEQNYDTI FEDYEKLLQS 219 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086 



Qy 220 ENYVTKRQS LKLLGELILDRHNFAIM TKYISK 251 

M h I :|:|| III j: |: 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144 

Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF--- 308 

11-11 1- : II :|ll : I - hi || 

Db 1145 PESMKLSG---REREMDSS ASSLRTQPNPQKLWEDIPELPPIHSSLAPP 1190 

Qy 309 QKERTDDEQFADEKNYLIKQIR 330 

I III II : I ||: 

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220 

RESULT 12 
US-09-723-096-2 

Sequence 2, Application US/09723096 
Patent No. 6448026 
GENERAL INFORMATION: 
APPLICANT: Beraud, Christophe 
APPLICANT: Freedman, Richard 

TITLE OF INVENTION: No. 6448026el motor proteins and methods for 
TITLE OF INVENTION: their use 
FILE REFERENCE: 1031 

CURRENT APPLICATION NUMBER: US/09/723,096 
CURRENT FILING DATE: 2000-11-27 
PRIOR APPLICATION NUMBER: US/09/641,807
PRIOR FILING DATE: 2000-08-17 
NUMBER OF SEQ ID NOS : 4 

SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 2 
LENGTH: 1279 
TYPE: PRT 
ORGANISM: Human 
FEATURE : 

NAME/KEY: VARIANT
LOCATION: (409) . . . (436) 

OTHER INFORMATION: Xaa = any amino acid 
US-09-723-096-2 

Query Match 6.7%; Score 113.5; DB 4; Length 1279; 

Best Local Similarity 19.3%; Pred. No. 0.024; 

Matches 87; Conservative 61; Mismatches 137; Indels 165; Gaps 14; 

Qy 23 DNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTL 82
in 1-1 I |: |:| I ::|: ::| | || | | 

Db 794 DHLQKLDEQKKWLDEEVEKVLNQRQELEELEADLKKREAIVSKKEALLQE--KSHLENKK 851

Qy 83 IADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISA 121

I - : I |:: hi :: :: :| 

Db 852 LRSSQALNTDSLKISTRL--NLLEQELSEKNVQLQTSTAEEKTKISEQVEVLQKEKDQLQ 909

Qy 122 HPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFS 162

IHI I :| II : h I |:: : j 

Db 910 KRRHDVDEKLKNGRVLSPEEEHVLFQLEEGIEALEAAIE---YRNESIQNRQKSLRASFH 966

Qy 163 NQFRDFFKYVE LSTFDIASDAFATFKDLLT RHKVLVAD--- 200

II :| II :| : I I :: : ||| | 

Db 967 NLSRGEANVLEKIACLSPVEIRTILFRYFNKVWLREAERKQQLYNEEMKMKVLERDNMV 1026

Qy 201 FLEQNYDTI FEDYEKLLQS 219

I Ih : I :|| : h 

Db 1027 RELESALDHLKLQCDRRLTLQQKEHEQKMQLLLHHFKEQDGEGIMETFKTYEDKIQQLEK 1086

Qy 220 ENYVTKRQS LKLLGELILDRHNFAIM TKYISK 251

M h I :|:|| III j: j: 

Db 1087 DLYFYKKTSRDHKKKLKELVGEAI--RRQLAPSEYQEAGDGVLKPEGGGMLSEELKWASR 1144

Qy 252 PENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSF--- 308
M-ll h: : II :||l : I - hi I| 

Db 1145 PESMKLSG---REREMDSS ASSLRTQPNPQKLWEDIPELPPIHSSLAPP 1190

Qy 309 QKERTDDEQFADEKNYLIKQIR 330

Db 1191 SGHMLGNENKTETDDNQFTKSHSRLSSQIQ 1220




RESULT 13 
US-09-417-485D-6 

; Sequence 6, Application US/09417485D 

; Patent No. 6541202 

; GENERAL INFORMATION: 

; APPLICANT: Long, David M. 

; APPLICANT: Metz, Anneke M. 

; APPLICANT: Love, Ruschelle A. 

; TITLE OF INVENTION: Telomerase Reverse Transcriptase (TERT) Genes 

; FILE REFERENCE: 47714-5009-US

; CURRENT APPLICATION NUMBER: US/09/417,485D

; CURRENT FILING DATE: 2002-06-14 

; NUMBER OF SEQ ID NOS : 49 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 6 

LENGTH: 2184 

TYPE : PRT 

ORGANISM: Plasmodium falciparum 
FEATURE : 

NAME/KEY: unsure 
LOCATION: (330) . . (335) 

OTHER INFORMATION: Xaa at position 330 = Leu or Ile;
OTHER INFORMATION: Xaa at position 335 = Asp or Gly. 
US-09-417-485D-6 

Query Match 6.6%; Score 113; DB 4; Length 2184; 

Best Local Similarity 21.9%; Pred. No. 0.057;

Matches 77; Conservative 58; Mismatches 140; Indels 76; Gaps 17; 

Qy 1 MKKMPLFSKSHKNPAEIV--KILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNE 58

I I • I • I I I • • • • I • • M I I I : I 

Db 309 LPEIDFFSEDRKEKSSSVGYDXKKKNXSNIKRFHNKINRTKEEKKKKWN--KIIINRNNI 366

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIAD LQLIDFEGKKDVTQI FNN 103

^1 |:: :: I : |::: Ml -I 

Db 367 LQHNT--TNKCKTFLFNKHIIFDKIENNNIPLFIYDLLNYIFKSDQTYFYHNNFIDEYKQ 424

Qy 104 ILRRQI--GTRSPTVEYI--SAHPHILFMLLK---GYEAPQIALRCGIMLRECIRHEPLA 156

••II I • - • I I ' M • I I'll • - I : 

Db 425 KICKQIKCSTKKNDISHI ITSRKENHLFHVQKLENNYKHPNI NKQLRKTKIL 476

Qy 157 KIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTR-HKV L 197 

I : I l-l I : I I I :| : II: 

Db 477 KYVY--NYFKEFINNVINTKFGKIYRKFFPRKHILNKIHKIFKIIRLQIIKKYRIINLRM 534 

Qy 198 VADFLEQN-YDTIFEDYE KLLQSENYVTKR-QSLKLLGELILDRHNFAIMT 246 

l-l III |:: |:||: HIM :|M | 

Db 535 NRKFIKQKVYDTFFKNYDFLSFSFKTYKIINFMVYITKKCIPIKLLG SKHNFKIFL 590

Qy 247 KYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297

1^1 1^ i I I : I :| I Mill 

Db 591 KNVKK FLLFNYKESFSLNQVMKNIKVKNIFQKKISKYNIKNRILLKN 637 



RESULT 14 
US-08-630-822A-70 

; Sequence 70, Application US/08630822A 

; Patent No. 5840695 

GENERAL INFORMATION: 

APPLICANT: FRANK, GLENN R. 
APPLICANT: HUNTER, SHIRLEY WU 
APPLICANT: WALLENFELS, LYNDA 

TITLE OF INVENTION: NOVEL ECTOPARASITE SALIVA PROTEINS 
TITLE OF INVENTION: AND APPARATUS TO COLLECT SUCH PROTEINS 
NUMBER OF SEQUENCES: 107 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheridan Ross P.C.
STREET: 1700 Lincoln Street, Suite 3500 
CITY: Denver 
STATE: Colorado 
COUNTRY: U.S.A.
ZIP: 80203 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/630,822A
FILING DATE: 11-APR-1996
CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 
NAME: CONNELL, GARY J. 
REGISTRATION NUMBER: 32,020 
REFERENCE/DOCKET NUMBER: 2618-17-C3 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (303) 863-9700 
TELEFAX: (303) 863-0223 
; INFORMATION FOR SEQ ID NO: 70: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 586 amino acids 
; TYPE: amino acid 

; TOPOLOGY: linear 

; MOLECULE TYPE: protein 
FEATURE : 



NAME/KEY: Xaa = any amino acid 
LOCATION: 379 
US-08-630-822A-70 



Query Match 6.2%; Score 105; DB 2; Length 586; 

Best Local Similarity 20.0%; Pred. No. 0.054;

Matches 77; Conservative 54; Mismatches 136; Indels 118; Gaps 15 

Qy 22 KDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVT 81 

I : :| :: |: | || : |||| 

Db 205 KTKIEVIKEEERKIREERQEAREEEEQRKQAELALNASSAAAEASS--AQEL 254 

Qy 82 LIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALR 141 

II nil I M II II :| I ||: 

Db 255 LIDTAPVIDAEKTPKV ATSP-VESPLAPPEVLIM GAPK 291

Qy 142 CGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADF 201 

Db 292 TPVATEVDKNADEVEFTK-KDLEWEDALDTLSKDKNNLVIEKEVIKDI 339 

Qy 202 LEQ NYDTIFEDYEKL-- 216 

h ||: :: | 

Db 340 KEEIADYQEDVEELKEAIVAAEKPKDEIKETKGAQRLLKXVNKMITKMDTWQEIESKES 399 

Qy 217 LQSENYVTKRQSL---KLLGELILDRHNFAI-MTKYISKPENLKLMMNLL-- 262 

h: |: I I I |1:::| || | : \\\: |:| 

Db 400 EKKAKTLPLEAPRSATETQELDVRKERGEILIDELMDAIKKVKNVPDENRLKLIENILGR 459 

Qy 263 --RDKSPNIQFEAFHVFKVF VASPHKTQPI VEILLKNQPKLIEFLSSFQKER 312 

II :h I I II : I : :|::| | : :|| : | : 

Db 460 IDTDKDRHIKVE--DVLKVIDIVEKEDGIMSTKQLDELVQLLKKEE--VIELEEKKEKQE 515 

Qy 313 TDDEQFADEKNYLIKQIRDLKKTAP 337 

^ U I : III 

Db 516 SQQKSFVPPSETLHLESSQQKSTVP 540



RESULT 15 

US-09-005-069-70 

; Sequence 70, Application US/09005069 

; Patent No. 5932470 

; GENERAL INFORMATION: 

APPLICANT: FRANK, GLENN R. 

APPLICANT: HUNTER, SHIRLEY WU 

APPLICANT: WALLENFELS, LYNDA 

TITLE OF INVENTION: NOVEL ECTOPARASITE SALIVA PROTEINS 
TITLE OF INVENTION: AND APPARATUS TO COLLECT SUCH PROTEINS 
NUMBER OF SEQUENCES: 107 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Sheridan Ross P.C. 

STREET: 1700 Lincoln Street, Suite 3500 

CITY: Denver 

STATE: Colorado 

COUNTRY: U.S.A. 

ZIP: 80203 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 



COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/005,069

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/630,822 

FILING DATE: 11-APR-1996
ATTORNEY/AGENT INFORMATION:

NAME: CONNELL, GARY J. 

REGISTRATION NUMBER: 32,020 

REFERENCE/DOCKET NUMBER: 2618-17-C3
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (303) 863-9700 

TELEFAX: (303) 863-0223 
INFORMATION FOR SEQ ID NO: 70: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 586 amino acids
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
FEATURE : 

NAME/KEY: Xaa = any amino acid 
LOCATION: 379 
US-09-005-069-70 

Query Match 6.2%; Score 105; DB 2; Length 586; 

Best Local Similarity 20.0%; Pred. No. 0.054; 

Matches 77; Conservative 54; Mismatches 136; Indels 118; Gaps 15 

Qy 22 KDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVT 81

I I ■ -I • • I • I M • MM 

Db 205 KTKIEVIKEEERKIREERQEAREEEEQRKQAELALNASSAAAEASS--AQEL 254

Qy 82 LIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALR 141

II HI II II II M :M Ih 
Db 255 LIDTAPVIDAEKTPKV ATSP-VESPLAPPEVLIM GAPK 291

Qy 142 CGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADF 201

Db

Qy 202 LEQ NYDTIFEDYEKL-- 216

Db 340 KEEIADYQEDVEELKEAIVAAEKPKDEIKETKGAQRLLKXVNKMITKMDTWQEIESKES 399

Qy 217 LQSENYVTKRQSL---KLLGELILDRHNFAI-MTKYISKPENLKLMMNLL-- 262

I- h I I I ||:::| || | : Mh hi 

Db 400 EKKAKTLPLEAPRSATETQELDVRKERGEILIDELMDAIKKVKNVPDENRLKLIENILGR 459

Qy 263 --RDKSPNIQFEAFHVFKVF VASPHKTQPIVEILLKNQPKLIEFLSSFQKER 312

II :h I III : I : :|::| I : :|| :|: 

Db 460 IDTDKDRHIKVE--DVLKVIDIVEKEDGIMSTKQLDELVQLLKKEE--VIELEEKKEKQE 515

Qy 313 TDDEQFADEKNYLIKQIRDLKKTAP 337

Db 516 SQQKSFVPPSETLHLESSQQKSTVP 540



Search completed: January 7, 2004, 16:45:03
Job time : 29 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:         January 7, 2004, 16:44:17 ; Search time 44 Seconds
                (without alignments)
                1215.701 Million cell updates/sec

Title:          US-10-088-872-2
Perfect score:  1704
Sequence:       1 MKKMPLFSKSHKNPAEIVKI . . FADEKNYLIKQIRDLKKTAP 337

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1107863 seqs, 158726573 residues

Total number of hits satisfying chosen parameters:      1107863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       A_Geneseq_19Jun03:*
               1:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1980.DAT:*
               2:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1981.DAT:*
               3:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1982.DAT:*
               4:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1983.DAT:*
               5:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1984.DAT:*
               6:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1985.DAT:*
               7:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1986.DAT:*
               8:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1987.DAT:*
               9:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1988.DAT:*
               10:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1989.DAT:*
               11:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1990.DAT:*
               12:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1991.DAT:*
               13:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1992.DAT:*
               14:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1993.DAT:*
               15:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1994.DAT:*
               16:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1995.DAT:*
               17:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1996.DAT:*
               18:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1997.DAT:*
               19:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1998.DAT:*
               20:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA1999.DAT:*
               21:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2000.DAT:*
               22:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2001.DAT:*
               23:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2002.DAT:*
               24:  /SIDS1/gcgdata/geneseq/geneseqp-embl/AA2003.DAT:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

                            SUMMARIES

Result            Query
   No.    Score   Match  Length  DB  ID          Description
     1   1704     100.0     337  21  AAY94247    Human calcium bind
     2   1704     100.0     337  22  AAM39078    Human polypeptide
     3   1704     100.0     337  22  AAB82090    Hnman 2i»^nt~o Monvr^n
     4   1466      86.0     289  22  AAB94139    Human protein sequ
     5   1381      81.0     341  22  AAB48970    numan A]MiL.-or (acu
     6   1381      81.0     496  22  AAE10858    vjax^-nuuian AiMiL.-ot'
     7   1381      81.0     552  22  AAE10859    -uex/i. - nuuian aj>j i l. - u f
     8   1376      80.8     341  21  AAY94248    Mouse calcium bind
     9   1354      79.5     354  22  ABG23844    Novel human diagno
    10   1297.5    76.1     350  22  AAB20387    Human acute neuron
    11   1162      68.2     237  22  AAM40864    Human polypeptide
    12   1111      65.2     339  22  ABB60392    Drosophila melanog
    13   1109      65.1     339  21  AAY94249    Drosophila calcium
    14   1063.5    62.4     377  21  AAY94250    exeyans yeasc-i
    15    716.5    42.0     343  21  AAG45273    Arabidopsis thalia
    16    689.5    40.5     300  21  AAG23886    Arabidopsis thalia
    17    685.5    40.2     400  21  AAG51052    Arabidopsis thalia
    18    685.5    40.2     504  21  AAG51051    Arabidopsis thalia
    19    685      40.2     300  21  AAG30714    Arabidopsis thalia
    20    685      40.2     300  21  AAG45274    Arabidopsis thalia
    21    685      40.2     305  21  AAG30713    Arabidopsis thalia
    22    684.5    40.2     326  21  AAG51053    Arabidopsis thalia
    23    675.5    39.6     290  21  AAG23887    Arabidopsis thalia
    24    671.5    39.4     345  21  AAG 0 S 0 R 9    Arabidopsis thalia
    25    638.5    37.5     320  21  AAG05090    Arabidopsis thalia
    26    539.5    31.7     213  21  AAG23888    Arabidopsis thalia
    27    533      31.3     213  21  AAG30715    Arabidopsis thalia
    28    533      31.3     213  21  AAG45275    Arabidopsis thalia
    29    478.5    28.1     197  21  ■CIlI WJ \J ^ \J ^ J-    Arabidopsis thalia
    30    467.5    27.4     154  21  AAG4 n S 1    Zea mays protein f
    31    453.5    26.6     148  21  AAG41152    Zea mays protein f
    32    438.5    25.7     139  21  AAG4 1 1 R ?    Zea mays protein f
    33    250.5    14.7     236  23  ABP02921    numan uk^a protexn
    34    241      14.1     639  22  ABG25372    Novel human diagno
    35    227.5    13.4     135  23  ABP34081    Human ORF3054 prot
    36    226.5    13.3     383  22  ABG23843    Novel human diagno
    37    125       7.3     660  22  ABB30817    Peptide #3468 enco
    38    125       7.3     660  23  ABG38772    Human peptide enco
    39    117.5     6.9     709  23  ABG70293    Human novel polype
    40    114.5     6.7     833  21  AAB42353    Human ORFX ORF2117
    41    113.5     6.7    1279  23  ABG70787    Human kinesin-rela
    42    113.5     6.7    1279  23  ABB80078    Human kinesin moto
    43    113.5     6.7    1279  24  ABG72397    Human partial kine
    44    113       6.6    2184  22  AAE00425    P. falciparum telo
    45    111.5     6.5     725  18  AAW39165    Human RHAMM protei



ALIGNMENTS 



RESULT 1 
AAY94247 

ID AAY94247 standard; protein; 337 AA. 
XX 

AC AAY94247; 

XX 

DT 10-AUG-2000 (first entry) 
XX 
DE Human calcium binding protein hCBP. 
XX 
KW Human; calcium binding protein; cancer; inflammation; CBP; 
KW reproductive disorder; autoimmune disorder; developmental disorder; 
KW seizure disorder; immune disorder; infection. 
XX 

OS Homo sapiens. 
XX 

PN WO200029580-A1. 
XX 

PD 25-MAY-2000. 
XX 

PF 12-NOV-1999; 99WO-US27027. 
XX 

PR 13-NOV-1998; 98US-0190965. 
XX 

PA (INCY-) INCYTE PHARM INC. 
XX 

PI Tang YT, Guegler KJ, Corley NC, Gorgone GA; 
XX 

DR WPI; 2000-387793/33. 
DR N-PSDB; AAA27332 . 
XX 

PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g. 
PT diagnosis, prevention and treatment of cancers, immune, developmental 
PT or reproductive disorders - 
XX 

PS Claim 1; Fig 1; 72pp; English. 
XX 

CC The present sequence is the human calcium binding protein hCBP. It 

CC was obtained by screening a coronary artery smooth muscle cDNA library, 

CC from which five overlapping nucleic acids were isolated, sequenced and 

CC expressed to give the protein. The protein and the gene encoding it are 

CC useful for the diagnosis and treatment of the following types of 

CC disorder: cancers (such as adenocarcinomas), reproductive disorders 

CC (such as infertility, ovulatory defects, endometriosis, disruptions of 

CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian 

CC hyperstimulation), autoimmune disorders (such as benign prostatic 

CC hyperplasia and prostatitis), developmental disorders (such as 

CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis), 

CC hereditary neuropathies, seizure disorders, immune disorders (such as 

CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's 

CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis, 

CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative 

CC colitis), and viral, bacterial, fungal, parasitic, protozoal and 

CC helminthic infections. 

XX 

SQ Sequence 337 AA; 

Query Match 100.0%; Score 1704; DB 21; Length 337; 
Best Local Similarity 100.0%; Pred. No. 1.3e-146; 
Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337

RESULT 2 
AAM39078 

ID AAM39078 Standard; Protein; 337 AA. 
XX 

AC AAM39078; 
XX 

DT 22-OCT-2001 (first entry) 
XX 

DE Human polypeptide SEQ ID NO 2223. 
XX 

KW Human; nootropic; immunosuppressant; cytostatic; gene therapy; cancer; 

KW peripheral nervous system; neuropathy; central nervous system; CNS; 

KW Alzheimer's; Parkinson's disease; Huntington's disease; haemostatic; 

KW amyotrophic lateral sclerosis; Shy-Drager Syndrome; chemotactic; 

KW chemokinetic; thrombolytic; drug screening; arthritis; inflammation; 

KW leukaemia. 

XX 

OS Homo sapiens. 
XX 

PN WO200153312-A1. 
XX 

PD 26-JUL-2001. 
XX 

PF 26-DEC-2000; 2000WO-US34263. 
XX 

PR 21-JAN-2000; 2000US-0488725. 

PR 25-APR-2000; 2000US-0552317. 

PR 09-JUL-2000; 2000US-0598042. 

PR 19-JUL-2000; 2000US-0520312. 

PR 03-AUG-2000; 2000US-0653450. 

PR 14-SEP-2000; 2000US-0662191. 

PR 19-OCT-2000; 2000US-0693036. 

PR 29-NOV-2000; 2000US-0727344. 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu Asundi V, Chen R, Ma Y, Qian XB, Ren F, Wang D; 

PI Wang J, Wang Z, Wehrman T, Xu C, Xue AJ, Yang Y, Zhang J; 



PI Zhao QA, Zhou P, Goodrich R, Drmanac RT; 
XX 

DR WPI; 2001-442253/47. 

DR N-PSDB; AAI58234. 
XX 

PT Novel nucleic acids and polypeptides, useful for treating disorders 

PT such as central nervous system injuries - 

XX 



PS Example 4; SEQ ID NO 2223; 10078pp; English. 

XX 

CC The invention relates to human nucleic acids (AAI57798-AAI61369) and 

CC the encoded polypeptides (AAM38642-AAM42213) with nootropic, 

CC immunosuppressant and cytostatic activity. The polynucleotides are useful 

CC in gene therapy. A composition containing a polypeptide or polynucleotide 

CC of the invention may be used to treat diseases of the peripheral nervous 

CC system, such as peripheral nervous injuries, peripheral neuropathy and 

CC localised neuropathies and central nervous system diseases, such as 

CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the 

CC utilisation of the activities such as: Immune system suppression, 

CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic 

CC and thrombolytic activity, cancer diagnosis and therapy, drug screening, 

CC assays for receptor activity, arthritis and inflammation, leukaemias and 

CC C.N.S disorders. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification. 

XX 

SQ Sequence 337 AA; 



Query Match 100.0%; Score 1704; DB 22; Length 337; 

Best Local Similarity 100.0%; Pred. No. 1.3e-146; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337



RESULT 3 

AAB82090 

ID AAB82090 Standard; Protein; 337 AA. 



XX 

AC AAB82090; 

XX 



DT 26-JUN-2001 (first entry) 
XX 
DE Human Acute Neuronal Induced Calcium Binding Protein, ANIC-BP. 
XX 
KW Human; cerebroprotective; neuroprotective; 
KW 
KW 
XX 

OS Homo sapiens. 
XX 

PN WO200123552-A1. 
XX 

PD 05-APR-2001. 
XX 
PF 18-SEP-2000; 2000WO-EP09132. 
XX 
PR 24-SEP-1999; 99EP-0118848. 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Den Daas I, Duecker K; 
XX 

DR WPI; 2001-308142/32. 
DR N-PSDB; AAF86462. 



UK iM-ir.jj^^, 

S ».vel .u^n acute —l,ifZf..r\T..l^^'f'^^^^ 

^P? rcrra1":rrt?pirLrro,is a„a spi„al co.a ^n^u^v - 

PS Claim 1; Page 41-42; 45pp; English. 

S The present se^ence Is *e prote^ se.uence^-^^ ^^^^^^^^^ 

cc induced calcium Binding Protexn J i,„<j „au».a, multiple 

CC protein are useful for "-"'"^ stroke acu ^^^^^^^^ 

S rrr:rso\re?uf riaSnerrofinrucing an immunological response in 



CC mammal. 
XX 

SQ Sequence 337 AA; 



Query Match 100.0%; Score 1704; DB 22; Length 337; 

Best Local Similarity 100.0%; Pred. No. 1.3e-146; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337

RESULT 4 
AAB94139 

ID AAB94139 Standard; Protein; 289 AA. 
XX 

AC AAB94139; 
XX 

DT 26-JUN-2001 (first entry) 
XX 

DE Human protein sequence SEQ ID NO: 14408. 
XX 
KW Human; primer; detection; diagnosis; antisense therapy; gene therapy. 
XX 

OS Homo sapiens. 
XX 

PN EP1074617-A2. 
XX 

PD 07-FEB-2001. 
XX 

PF 28-JUL-2000; 2000EP-0116126. 
XX 

PR 29-JUL-1999; 99JP-0248036. 
PR 27-AUG-1999; 99JP-0300253. 
PR 11-JAN-2000; 2000JP-0118776. 
PR 02-MAY-2000; 2000JP-0183767. 
PR 09-JUN-2000; 2000JP-0241899. 
XX 

PA (HELI-) HELIX RES INST. 
XX 


PI Ota T, Isogai T, Nishikawa T, Hayashi K, Saito K, Yamamoto J; 

PI Ishii S, Sugiyama T, Wakamatsu A, Nagai K, Otsuki T; 

XX 

DR WPI; 2001-318749/34. 
XX 

PT Primer sets for synthesizing polynucleotides, particularly the 5602 

PT full-length cDNAs defined in the specification, and for the detection 

PT and/or diagnosis of the abnormality of the proteins encoded by the 

PT full-length cDNAs - 
XX 

PS Claim 8; SEQ ID 14408; 2537pp + CD ROM; English. 
XX 

CC The present invention describes primer sets for synthesising 5602 

CC full-length cDNAs defined in the specification. Where a primer set 

CC comprises: (a) an oligo-dT primer and an oligonucleotide complementary 

CC to the complementary strand of a polynucleotide which comprises one of 

CC the 5602 nucleotide sequences defined in the specification, where the 

CC oligonucleotide comprises at least 15 nucleotides; or (b) a combination 

CC of an oligonucleotide comprising a sequence complementary to the 

CC complementary strand of a polynucleotide which comprises a 5'-end 

CC sequence and an oligonucleotide comprising a sequence complementary to a 

CC polynucleotide which comprises a 3'-end sequence, where the 

CC oligonucleotide comprises at least 15 nucleotides and the combination of 

CC the 5'-end sequence/3'-end sequence is selected from those defined in 

CC the specification. The primer sets can be used in antisense therapy and 

CC in gene therapy. The primers are useful for synthesising polynucleotides, 

CC particularly full-length cDNAs. The primers are also useful for the 

CC detection and/or diagnosis of the abnormality of the proteins encoded by 

CC the full-length cDNAs. The primers allow obtaining of the full-length 

CC cDNAs easily without any specialised methods. AAH03166 to AAH13628 and 

CC AAH13633 to AAH18742 represent human cDNA sequences; AAB92446 to 

CC AAB95893 represent human amino acid sequences; and AAH13629 to AAH13632 

CC represent oligonucleotides, all of which are used in the exemplification 

CC of the present invention. 

XX 

SQ Sequence 289 AA; 



Query Match 86.0%; Score 1466; DB 22; Length 289; 

Best Local Similarity 99.7%; Pred. No. 4.6e-125; 

Matches 288; Conservative 0; Mismatches 1; Indels 0; Gaps 0 



Qy  49 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQ 108
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQ 60

Qy 109 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF 168
       ||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db  61 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLVKIILFSNQFRDF 120

Qy 169 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 228
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 180

Qy 229 LKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQ 288
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQ 240

Qy 289 PIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 PIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 289



RESULT 5 
AAB48970 

ID AAB48970 Standard; Protein; 341 AA. 
XX 

AC AAB48970; 
XX 

DT 27-MAR-2001 (first entry) 

XX 

DE Human ANIC-BP (acute neuronal induced calcium-binding protein) 
XX 
KW Human; acute neuronal induced calcium-binding protein; ANIC-BP; 
KW Mo25 homologue; HymA homologue; drug screening; stroke; 
KW acute head trauma; multiple sclerosis; spinal cord injury; vaccine; 
KW cerebroprotective; neuroprotective. 
XX 

OS Homo sapiens. 
XX 

PN WO200078947-A1. 
XX 

PD 28-DEC-2000. 
XX 

PF 14-JUN-2000; 2000WO-EP05457. 
XX 

PR 22-JUN-1999; 99EP-0112024. 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Den Daas I, Fischer V, Seyfried C, Von Melchner L; 
XX 

DR WPI; 2001-102721/11. 

DR N-PSDB; AAC91772 . 
XX 

PT Novel acute neuronal induced calcium binding protein, useful for 

PT treating acute head trauma, stroke, multiple sclerosis and spinal cord 

PT injury - 

XX 

PS Claim 2; Page 37; 50pp; English. 
XX 

CC The invention relates to human acute neuronal induced calcium-binding 

CC protein (ANIC-BP) and to nucleic acid encoding it. The invention 

CC also relates to expression systems and recombinant host cells comprising 

CC ANIC-BP DNA, the recombinant production of ANIC-BP, antibodies specific 

CC for ANIC-BP, fusion proteins comprising ANIC-BP and an immunoglobulin 

CC Fc region, and methods of screening for modulators of ANIC-BP function. 

CC ANIC-BP has homology and structural similarity to HymA and Mo25 proteins. 

CC ANIC-BP proteins and nucleotides are useful for treating stroke and 

CC acute head trauma, multiple sclerosis and spinal cord injury. ANIC-BP 

CC proteins are useful in screening assays, for identifying membrane bound 

CC or soluble receptors, and also in vaccines. ANIC-BP nucleotides are 

CC useful as diagnostic reagents, as tools for tissue expression studies, 

CC for chromosome localisation studies, as genetic vaccines, and in 

CC the generation of transgenic animals. The present sequence represents 



CC human ANIC-BP. 
XX 

SQ Sequence 341 AA; 

Query Match 81.0%; Score 1381; DB 22; Length 341; 
Best Local Similarity 81.0%; Pred. No. 3.2e-117; 
Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2; 

Qy   4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
       ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db   1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy  60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
       || ||||||||||||:|||| ||:|||||||||||||| ||||||||||||||:||||||
Db  61 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
           :||||||||||:|:||| |||||||||||||||||||:| || |||:|||:|||||
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
       ||||||||||||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
       ||| ||||||||||||||||||||||| |||||||||||||||:|:|||||::||||||
Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
       ||||||| || :||:|||| ||| ||:|||||||: |
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 337



RESULT 6 
AAE10858 

ID AAE10858 standard; Protein; 496 AA. 
XX 

AC AAE10858; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE Gal4-human ANIC-BP-1 fusion protein. 
XX 

KW Human; acute neuronal induced calcium binding protein type 1 ligand; 

KW ANIC-BP-1; human disease; stroke; head trauma; multiple sclerosis; 

KW Parkinson's disease; Alzheimer's disease; spinal cord injury; vaccine; 

KW gene therapy; fusion protein; Gal4 protein. 

XX 

OS Chimeric - Homo sapiens . 
OS Chimeric - Unidentified. 
XX 

PN WO200170771-A2. 

XX 

PD 27-SEP-2001. 
XX 

PF 20-MAR-2001; 2001WO-EP03149. 
XX 

PR 21-MAR-2000; 2000EP-0106110. 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Den Daas I, Duecker K, Hock B; 

XX 

DR WPI; 2001-607519/69. 
XX 

PT Novel acute neuronal induced calcium binding protein type 1 ligand 

PT polypeptides, useful in the treatment of stroke, head trauma, multiple 

PT sclerosis, Parkinson's disease, Alzheimer's disease and spinal cord 

PT injury 

XX 

PS Disclosure; Page 42-44; 46pp; English. 
XX 

CC The invention relates to human acute neuronal induced calcium binding 

CC protein type 1 (ANIC-BP-1) ligand polypeptides and polynucleotides. 

CC Sequences of the invention are useful for treating human diseases 

CC including stroke, head trauma, multiple sclerosis, Parkinson's disease, 

CC Alzheimer's disease and spinal cord injury. They are also useful as 

CC vaccines. ANIC-BP-1 ligands are useful for identifying membrane bound 

CC soluble receptors. Polynucleotides of the invention are useful as 

CC diagnostic reagents, for chromosome localization studies, and as 

CC valuable tools for tissue expression studies. They are also useful in 

CC gene therapy. The present sequence is Gal4-human ANIC-BP-1 fusion 

CC protein comprising the Gal4 protein and a C-terminally linked human 

CC ANIC-BP-1 protein. 

XX 

SQ Sequence 496 AA; 

Query Match 81.0%; Score 1381; DB 22; Length 496; 
Best Local Similarity 81.0%; Pred. No. 5.2e-117; 
Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2; 

Qy   4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
       ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db 156 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 215

Qy  60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
       || ||||||||||||:|||| ||:|||||||||||||| ||||||||||||||:||||||
Db 216 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 275

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
           :||||||||||:|:||| |||||||||||||||||||:| || |||:|||:|||||
Db 276 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 335

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
       ||||||||||||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db 336 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 395

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
       ||| ||||||||||||||||||||||| |||||||||||||||:|:|||||::||||||
Db 396 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 455

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
       ||||||| || :||:|||| ||| ||:|||||||: |
Db 456 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 492



RESULT 7 

AAE10859 

ID AAE10859 standard; Protein; 552 AA. 
XX 

AC AAE10859; 
XX 

DT 18-DEC-2001 (first entry) 
XX 

DE LexA-human ANIC-BP-1 fusion protein. 
XX 

KW Human; acute neuronal induced calcium binding protein type 1 ligand; 

KW ANIC-BP-1; human disease; stroke; head trauma; multiple sclerosis; 

KW Parkinson's disease; Alzheimer's disease; spinal cord injury; vaccine; 

KW gene therapy; fusion protein; LexA protein. 

XX 

OS Chimeric - Homo sapiens. 

OS Chimeric - Unidentified. 

XX 

FH Key Location/Qualifiers 

FT Region 1..202 

FT /note= "LexA protein" 

FT Region 203..552 

FT /note= "Human ANIC-BP-1 protein" 

XX 

PN WO200170771-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 20-MAR-2001; 2001WO-EP03149 . 
XX 

PR 21-MAR-2000; 2000EP-0106110 . 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Den Daas I, Duecker K, Hock B; 
XX 

DR WPI; 2001-607519/69. 
XX 

PT Novel acute neuronal induced calcium binding protein type 1 ligand 

PT polypeptides, useful in the treatment of stroke, head trauma, multiple 

PT sclerosis, Parkinson's disease, Alzheimer's disease and spinal cord 

PT injury 

XX 

PS Disclosure; Page 44-46; 45pp; English. 

XX 

CC The invention relates to human acute neuronal induced calcium binding 

CC protein type 1 (ANIC-BP-1) ligand polypeptides and polynucleotides. 

CC Sequences of the invention are useful for treating human diseases 

CC including stroke, head trauma, multiple sclerosis, Parkinson's disease, 

CC Alzheimer's disease and spinal cord injury. They are also useful as 

CC vaccines. ANIC-BP-1 ligands are useful for identifying membrane bound 

CC soluble receptors. Polynucleotides of the invention are useful as 

CC diagnostic reagents, for chromosome localization studies, and as 

CC valuable tools for tissue expression studies. They are also useful in 

CC gene therapy. The present sequence is LexA-human ANIC-BP-1 fusion 



CC protein comprising the LexA protein and a C-terminally linked human 

CC ANIC-BP-1 protein. 

XX 

SQ Sequence 552 AA; 



Query Match 81.0%; Score 1381; DB 22; Length 552; 

Best Local Similarity 81.0%; Pred. No. 6e-117; 

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2 



Qy   4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
       ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db 212 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 271

Qy  60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
       || ||||||||||||:|||| ||:|||||||||||||| ||||||||||||||:||||||
Db 272 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 331

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
           :||||||||||:|:||| |||||||||||||||||||:| || |||:|||:|||||
Db 332 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 391

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
       ||||||||||||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db 392 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 451

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
       ||| ||||||||||||||||||||||| |||||||||||||||:|:|||||::||||||
Db 452 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 511

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
       ||||||| || :||:|||| ||| ||:|||||||: |
Db 512 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 548



RESULT 8 

AAY94248 

ID AAY94248 standard; protein; 341 AA. 
XX 

AC AAY94248; 
XX 

DT 10-AUG-2000 (first entry) 
XX 

DE Mouse calcium binding protein MO25. 
XX 

KW Mouse; calcium binding protein; cancer; inflammation; MO25; CBP; 
KW reproductive disorder; autoimmune disorder; developmental disorder; 
KW seizure disorder; immune disorder; infection. 
XX 

OS Mus sp. 
XX 

PN WO200029580-A1. 

XX 

PD 25-MAY-2000. 
XX 

PF 12-NOV-1999; 99WO-US27027. 
XX 

PR 13-NOV-1998; 98US-0190965. 
XX 

PA (INCY-) INCYTE PHARM INC. 
XX 

PI Tang YT, Guegler KJ, Corley NC, Gorgone GA; 
XX 

DR WPI; 2000-387793/33. 
XX 

PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g. 

PT diagnosis, prevention and treatment of cancers, immune, developmental 

PT or reproductive disorders - 
XX 

PS Disclosure; Page 66-67; 72pp; English. 
XX 

CC The present sequence is the mouse calcium binding protein MO25. It 

CC was used in a sequence alignment to identify human calcium binding 

CC protein hCBP. The hCBP protein and the gene encoding it are 

CC useful for the diagnosis and treatment of the following types of 

CC disorder: cancers (such as adenocarcinomas), reproductive disorders 

CC (such as infertility, ovulatory defects, endometriosis, disruptions of 

CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian 

CC hyperstimulation), autoimmune disorders (such as benign prostatic 

CC hyperplasia and prostatitis), developmental disorders (such as 

CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis), 

CC hereditary neuropathies, seizure disorders, immune disorders (such as 

CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's 

CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis, 

CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative 

CC colitis), and viral, bacterial, fungal, parasitic, protozoal and 

CC helminthic infections. 

XX 

SQ Sequence 341 AA; 

Query Match 80.8%; Score 1376; DB 21; Length 341; 

Best Local Similarity 80.7%; Pred. No. 9e-117; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2 

Qy   4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59

Db 

Qy  60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119

Db 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179

Db 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239

Db 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299

Db 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
       ||||||| || :||:|||| ||| ||:||||:||: |
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337

RESULT 9 
ABG23844 

ID ABG23844 standard; Protein; 354 AA. 
XX 

AC ABG23844; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #23835. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens . 
XX 

PN WO200175067-A2. 

XX 

PD 11-OCT-2001. 
XX 

PF 30-MAR-2001; 2001WO-US08631. 
XX 

PR 31-MAR-2000; 2000US-0540217. 

PR 23-AUG-2000; 2000US-0649167 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS88031. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity 
XX 

PS Claim 20; SEQ ID No 54203; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and 

CC polypeptide (II) sequences. (I) is useful as hybridisation probes, 

CC polymerase chain reaction (PCR) primers, oligomers, and for chromosome 

CC and gene mapping, and in recombinant production of (II). The 

CC polynucleotides are also used in diagnostics as expressed sequence tags 

CC for identifying expressed genes. (I) is useful in gene therapy techniques 

CC to restore normal activity of (II) or to treat disease states involving 

CC (II). (II) is useful for generating antibodies against it, detecting or 

CC quantitating a polypeptide in tissue, as molecular weight markers and as 

CC a food supplement. (II) and its binding partners are useful in medical 

CC imaging of sites expressing (II). (I) and (II) are useful for treating 

CC disorders involving aberrant protein expression or biological activity. 

CC The polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 



CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human 

CC diagnostic amino acid sequences of the invention. 

CC Note: The sequence data for this patent did not appear in the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences. 
XX 

SQ Sequence 354 AA; 



Query Match 79.5%; Score 1354; DB 22; Length 354; 
Best Local Similarity 79.2%; Pred. No. 9.5e-115; 
Matches 267; Conservative 33; Mismatches 33; Indels 4; Gaps 2; 

Qy   4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
       ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db  14 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 73

Qy  60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
       :| ||| ||||||||:||||:||:|||||||||||||| ||||||||||||||:||||||
Db  74 DPQTEAGAQLAQELYNSGLLITLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 133

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
           :||||||||||:|:||| |||||||||||||| ||||:| || |||:|||:|||||
Db 134 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLGKIILWSEQFYDFFRYVEMSTFDI 193

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
       ||||||||| ||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db 194 ASDAFATFKGLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 253

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
       ||| ||||||||| ||||||||||||| ||||||||||| |||:|:|||||::||||||
Db 254 HNFTIMTKYISKPVNLKLMMNLLRDKSRNIQFEAFHVFKAFVANPNKTQPILDILLKNQA 313

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
       ||||||| || :||:|||| ||| ||:|||||||: |
Db 314 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 350



RESULT 10 
AAB20387 

ID AAB20387 standard; Protein; 350 AA. 
XX 

AC AAB20387; 

XX 

DT 11-JUN-2001 (first entry) 
XX 

DE Human acute neuronal induced calcium binding protein ANIC-BP-IB. 
XX 

KW Acute neuronal induced calcium binding protein; ANIC-BP-IB; 

KW splice variant; human; stroke; head trauma; Parkinson's disease; 

KW Alzheimer's disease; multiple sclerosis; spinal cord injury; 

KW cerebroprotective; antiparkinsonian; nootropic; neuroprotective; 

KW therapy; diagnosis; vaccine. 

XX 

OS Homo sapiens . 
XX 

PN WO200125423-A1. 



XX 

PD 12-APR-2001. 
XX 

PF 28-SEP-2000; 2000WO-EP09475 . 
XX 

PR 04-OCT-1999; 99EP-0119113 . 
XX 

PA (MERE ) MERCK PATENT GMBH. 
XX 

PI Duecker K, Den Daas I; 
XX 

DR WPI; 2001-266306/27. 

DR N-PSDB; AAF30688 . 
XX 

PT Novel human acute neuronal induced calcium-binding protein like protein 

PT splice variant, useful for treating stroke, acute head trauma, 

PT Parkinson's disease, Alzheimer's disease multiple sclerosis, spinal 

PT cord injury - 

XX 

PS Claim 2; Page 44-45; 49pp; English. 
XX 

CC The present sequence is that of a novel human acute neuronal induced 

CC calcium binding protein-like protein splice variant, ANIC-BP-IB. 

CC The protein shows homology to other members of the calcium binding 

CC protein family, including ANIC-BP, a protein discovered by mRNA 

CC differential display that is upregulated in a rat model of head 

CC trauma. ANIC-BP and ANIC-BP-IB differ in their C-terminal portions. 

CC The variant protein could serve as a novel drug target. The 

CC invention provides ANIC-BP-IB polynucleotides (see AAF30688) and

CC polypeptides, expression vectors, host cells and antibodies, as 

CC well as methods for producing the protein and for treating or 

CC preventing disorders associated with expression of the protein by 

CC inhibiting or activating the action of ANIC-BP-IB. Diseases that 

CC may be treated include stroke and acute head trauma, Parkinson's 

CC disease, Alzheimer's disease, multiple sclerosis and spinal cord 

CC injury. The polynucleotides and polypeptides can also be used in 

CC diagnostic assays and in vaccines, and to identify agonists and 

CC antagonists useful for treating conditions associated with 

CC ANIC-BP-IB imbalance. 

XX 

SQ Sequence 350 AA; 



Query Match 76.1%; Score 1297.5; DB 22; Length 350; 

Best Local Similarity 76.0%; Pred. No. 1.3e-109; 

Matches 263; Conservative 32; Mismatches 38; Indels 13; Gaps 3 



Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
II 1 lllhlhlll lh::hllll III :|hllllhl IIIMI |||||
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
II lllllllllllhllll Ihllllllllllllll lllllllllllllhllllll
Db 61 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
HllllllllhhIII llllllllllllllllllhl II llhllhlllll
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180



Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

IIIMIillllllllhl hlllhll I :illll lllllllillllllllhlll 

Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

III I I II I II I I I i I M I I I I I M I I I II I I II I II I I I II : I : II II I II II I I 
Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 300 

Qy 300 KLIEFLSSFQKERTD DEQFADEKNYLIKQIRDLKKTA 336 

lllllll M :||| : : : |||||: | 

Db 301 KLIEFLSKFQNDRTDCMSSSVPTTNSRVDLRVKPRTRGIRDLKRPA 346 

RESULT 11 
AAM40864 

ID AAM40864 standard; Protein; 237 AA. 
XX 

AC AAM40864;

XX 

DT 22-OCT-2001 (first entry) 
XX 

DE Human polypeptide SEQ ID NO 5795. 
XX 

KW Human; nootropic; immunosuppressant; cytostatic; gene therapy; cancer; 

KW peripheral nervous system; neuropathy; central nervous system; CNS; 

KW Alzheimer's; Parkinson's disease; Huntington's disease; haemostatic; 

KW amyotrophic lateral sclerosis; Shy-Drager Syndrome; chemotactic; 

KW chemokinetic; thrombolytic; drug screening; arthritis; inflammation; 

KW leukaemia. 

XX 

OS Homo sapiens. 
XX 

PN WO200153312-A1. 
XX 

PD 26-JUL-2001. 

XX 

PF 26-DEC-2000; 2000WO-US34263 . 
XX 

PR 21-JAN-2000; 2000US-0488725 . 

PR 25-APR-2000; 2000US-0552317.

PR 09-JUL-2000; 2000US-0598042.

PR 19-JUL-2000; 2000US-0620312 . 

PR 03-AUG-2000; 2000US-0653450 . 

PR 14-SEP-2000; 2000US-0662191 . 

PR 19-OCT-2000; 2000US-0693036 . 

PR 29-NOV-2000; 2000US-0727344 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu C, Asundi V, Chen R, Ma Y, Qian XB, Ren F, Wang D; 

PI Wang J, Wang Z, Wehrman T, Xu C, Xue AJ, Yang Y, Zhang J; 

PI Zhao QA, Zhou P, Goodrich R, Drmanac RT; 
XX 

DR WPI; 2001-442253/47. 

DR N-PSDB; AAI60020. 
XX 



PT Novel nucleic acids and polypeptides, useful for treating disorders 

PT such as central nervous system injuries - 

XX 

PS Example 2; SEQ ID NO 5795; 10078pp; English. 

XX 

CC The invention relates to human nucleic acids (AAI57798-AAI61369) and

CC the encoded polypeptides (AAM38642-AAM42213) with nootropic, 

CC immunosuppressant and cytostatic activity. The polynucleotides are useful 

CC in gene therapy. A composition containing a polypeptide or polynucleotide

CC of the invention may be used to treat diseases of the peripheral nervous 

CC system, such as peripheral nervous injuries, peripheral neuropathy and 

CC localised neuropathies and central nervous system diseases, such as 

CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 

CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the 

CC utilisation of the activities such as: Immune system suppression, 

CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic

CC and thrombolytic activity, cancer diagnosis and therapy, drug screening, 

CC assays for receptor activity, arthritis and inflammation, leukaemias and 

CC C.N.S disorders. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification. 
XX 

SQ Sequence 237 AA; 



Query Match 68.2%; Score 1162; DB 22; Length 237;

Best Local Similarity 100.0%; Pred. No. 1.6e-97;

Matches 227; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 111 TRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK 170
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   2 TRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK 61

Qy 171 YVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLK 230
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  62 YVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLK 121

Qy 231 LLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPI 290
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 122 LLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPI 181

Qy 291 VEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||||||||||||
Db 182 VEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 228



RESULT 12 
ABB60392 

ID ABB60392 standard; Protein; 339 AA. 
XX 

AC ABB60392; 
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 7968. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 
KW pharmaceutical. 



XX 

OS Drosophila melanogaster . 
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US09231 . 
XX 

PR 23-MAR-2000; 2000US-191637P.

PR 11-JUL-2000; 2000US-0614150.
XX 

PA (PEKE ) PE CORP NY. 
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL04495. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signalling and cell-cell 

PT interactions - 
XX 

PS Disclosure; SEQ ID NO 7968; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of

CC insecticides, therapeutics and pharmaceutical drugs. The invention

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins

CC (ABB57737-ABB72072).

CC The sequence data for this patent did not form part of the printed

CC specification, but was obtained in electronic format directly from WIPO

CC at ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 339 AA; 

Query Match 65.2%; Score 1111; DB 22; Length 339; 

Best Local Similarity 65.0%; Pred. No. 1.2e-92; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3 

Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

II I Ml hi hi I I |: : II hi :|| hllhl ::| :| h:: III 
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLYGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

' IIIIHMhl ||: II :| lllllll I lllhlllllllMMIMI 
Db 61 DYVVAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNVLRRQIGTRSPTVEYICTK 120

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

MM h III hill I 111111:1 nihil::! Ihllhilllll
Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180



Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239

Mlhllhlllllhl |:|h III I : h:|| I II I I h I I II II I II h I I I

Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299

IN I I I I | | | | | | | | | | | : | : | HI -I I hi I 

Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

l|::|h:| H-MII III lll|||::|| 
Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334 

RESULT 13 
AAY94249 

ID AAY94249 Standard; protein; 339 AA. 
XX 

AC AAY94249;
XX 

DT 10-AUG-2000 (first entry)
XX 

DE Drosophila calcium binding protein DM025.

XX 

KW Drosophila; calcium binding protein; cancer; inflammation; DM025; CBP;
KW reproductive disorder; autoimmune disorder; developmental disorder;

KW seizure disorder; immune disorder; infection. 
XX 

OS Drosophila melanogaster . 
XX 

PN WO200029580-A1 . 

XX 

PD 25-MAY-2000. 
XX 

PF 12-NOV-1999; 99WO-US27027 . 
XX 

PR 13-NOV-1998; 98US-0190965.
XX 

PA (INCY-) INCYTE PHARM INC. 

XX 

PI Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX 

DR WPI; 2000-387793/33. 
XX 

PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g. 

PT diagnosis, prevention and treatment of cancers, immune, developmental 

PT or reproductive disorders - 
XX 

PS Disclosure; Page 67-68; 72pp; English. 
XX 

CC The present sequence is the Drosophila calcium binding protein DM025. It 

CC was used in a sequence alignment to identify human calcium binding 

CC protein hCBP. The hCBP protein and the gene encoding it are 

CC useful for the diagnosis and treatment of the following types of 

CC disorder: cancers (such as adenocarcinomas), reproductive disorders

CC (such as infertility, ovulatory defects, endometriosis, disruptions of

CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian

CC hyperstimulation), autoimmune disorders (such as benign prostatic

CC hyperplasia and prostatitis), developmental disorders (such as

CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),



CC hereditary neuropathies, seizure disorders, immune disorders (such as 

CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's 

CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis, 

CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative 

CC colitis), and viral, bacterial, fungal, parasitic, protozoal and 

CC helminthic infections. 

XX 

SQ Sequence 339 AA; 

Query Match 65.1%; Score 1109; DB 21; Length 339; 

Best Local Similarity 65.0%; Pred. No. 1.8e-92; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3 

Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
MM M hi MM M: : M hi HI hMhl -I :| h:: Ml
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLHGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122
Mlhlllhl Ih II :| IIIMII I lllhllllllllMIIIII
Db 61 DYVVAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNLLRRQIGTRSPTVEYICTK 120

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
I Ml h III hill I Mill hi Mlhl \::\ Ihllhllllll
Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239
MMMIhlllllhl hlh III I : h:|| MlllhlMllllllhlll
Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
Ml MhMhIIMIIIIhhMI MMIMIMIMIhhl MhMlhll
Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333
M-lhM :h:|||| III IMIIhMI
Db 301



RESULT 14
AAY94250

ID AAY94250 standard; protein; 377 AA.
XX

AC AAY94250;

XX

DT 10-AUG-2000 (first entry)
XX

DE C. elegans yeast-like calcium binding protein.
XX

KW Calcium binding protein; cancer; inflammation; yeast-like CBP; CBP;
KW reproductive disorder; autoimmune disorder; developmental disorder;
KW seizure disorder; immune disorder; infection.
XX

OS Caenorhabditis elegans.
XX

PN WO200029580-A1.
XX

PD 25-MAY-2000.

XX 

PF 12-NOV-1999; 99WO-US27027 . 
XX 

PR 13-NOV-1998; 98US-0190965.
XX 

PA (INCY-) INCYTE PHARM INC. 
XX 

PI Tang YT, Guegler KJ, Corley NC, Gorgone GA;
XX 

DR WPI; 2000-387793/33. 
XX 

PT Human hCBP protein, and the nucleic acid encoding it, useful for e.g. 

PT diagnosis, prevention and treatment of cancers, immune, developmental 

PT or reproductive disorders - 
XX 

PS Disclosure; Page 68-69; 72pp; English. 
XX 

CC The present sequence is the C. elegans yeast-like CBP. It 

CC was used in a sequence alignment to identify human calcium binding 

CC protein hCBP. The hCBP protein and the gene encoding it are 

CC useful for the diagnosis and treatment of the following types of 

CC disorder: cancers (such as adenocarcinomas), reproductive disorders 

CC (such as infertility, ovulatory defects, endometriosis, disruptions of 

CC the oestrus and menstrual cycles, polycystic ovary syndrome and ovarian 

CC hyperstimulation), autoimmune disorders (such as benign prostatic

CC hyperplasia and prostatitis), developmental disorders (such as

CC Cushing's syndrome, muscular dystrophy and gonadal dysgenesis),

CC hereditary neuropathies, seizure disorders, immune disorders (such as 

CC AIDS, allergies, anaemia, asthma, atherosclerosis, cholecystitis, Crohn's 

CC disease, diabetes, Graves' disease, multiple sclerosis, psoriasis,

CC rheumatoid arthritis, scleroderma, Sjogren's syndrome and ulcerative 

CC colitis), and viral, bacterial, fungal, parasitic, protozoal and 

CC helminthic infections. 

XX 

SQ Sequence 377 AA; 

Query Match 62.4%; Score 1063.5; DB 21; Length 377; 

Best Local Similarity 60.5%; Pred. No. 2.8e-88; 

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49
^ II II lllhll-ll h: I Ihl Ml Ml Mill:: :
Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106
I ^ I = M n IIMIIhl:: :| || | :M MM MM MM
Db 61 KSFIYGNDSAEPSSEHVVQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166
, MMIMMIM: I I M h:M I Ml IhMII MM IMMhM I
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226
Nil: llhlMhMhl MM ::|:M: MM I h II MM Ihl
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286
Mllllllhllllll llllll hlhlll llllll llhllllllllllhhl
Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335
■■W --W =h Ihllll I :||lllll III lllllh::| :
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349



RESULT 15
AAG45273

ID AAG45273 Standard; Protein; 343 AA.
XX
AC AAG45273;
XX
DT 18-OCT-2000 (first entry)
XX
DE Arabidopsis thaliana protein fragment SEQ ID NO: 56816.
XX
KW Protein identification; signal transduction pathway; metabolic pathway;
KW hybridisation assay; genetic mapping; gene expression control; promoter
KW termination sequence.
XX
OS Arabidopsis thaliana.
XX
PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
PR 01-APR-1999; 99US-0127462.
PR 06-APR-1999; 99US-0128234.
PR 08-APR-1999; 99US-0128714.
PR 16-APR-1999; 99US-0129845.
PR 19-APR-1999; 99US-0130077.
PR 21-APR-1999; 99US-0130449.
PR 23-APR-1999; 99US-0130510.
PR 23-APR-1999; 99US-0130891.
PR 28-APR-1999; 99US-0131449.
PR 30-APR-1999; 99US-0132048.
PR 30-APR-1999; 99US-0132407.
PR 04-MAY-1999; 99US-0132484.
PR 05-MAY-1999; 99US-0132485.
PR 06-MAY-1999; 99US-0132486.
PR 06-MAY-1999; 99US-0132487.
PR 07-MAY-1999; 99US-0132863.
PR 11-MAY-1999; 99US-0134256.
PR 14-MAY-1999; 99US-0134218.
PR 14-MAY-1999; 99US-0134219.
PR 14-MAY-1999; 99US-0134221.
PR 14-MAY-1999; 99US-0134370.
1999,
1999,


PR 


19- 


-JUL- 


1999, 



99US-0134768 

99US-0134941 

99US-0135124 

99US-0135353 

99US-0135629 

99US-0136021 

99US-0136392 

99US-0136782 
Minimum DB seq length: 0
Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
44, Appl



ALIGNMENTS 



RESULT 1 
US-10-025-730-1 

; Sequence 1, Application US/10025730
; Publication No. US20030045466A1
; GENERAL INFORMATION:
; APPLICANT: Tang, Y. Tom
; APPLICANT: Guegler, Karl J.
; APPLICANT: Corley, Neil C.
; APPLICANT: Gorgone, Gina A.
; TITLE OF INVENTION: CALCIUM BINDING PROTEIN
; FILE REFERENCE: PF-0635 US
; CURRENT APPLICATION NUMBER: US/10/025,730
; CURRENT FILING DATE: 2001-12-18
; PRIOR APPLICATION NUMBER: US/09/190,965
; PRIOR FILING DATE: 1998-11-13
; NUMBER OF SEQ ID NOS: 5
; SOFTWARE: PERL Program
; SEQ ID NO 1
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE: -
; OTHER INFORMATION: 3734805
US-10-025-730-1 



Query Match 100.0%; Score 1704; DB 15; Length 337; 

Best Local Similarity 100.0%; Pred. No. 3.1e-147; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0 



Qy   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60

Qy  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120

Qy 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
       |||||||||||||||||||||||||||||||||||||
Db 301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337



RESULT 2 
US-10-239-079-5 

; Sequence 5, Application US/10239079
; Publication No. US20030148446A1
; GENERAL INFORMATION:
; APPLICANT: Merck Patent GmbH
; TITLE OF INVENTION: ANIC-BP1-ligand
; FILE REFERENCE: ANIC-BP-1-ligand
; CURRENT APPLICATION NUMBER: US/10/239,079
; CURRENT FILING DATE: 2002-09-19
; NUMBER OF SEQ ID NOS: 8
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 5
; LENGTH: 496
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: Gal4-ANIC-BP-1
; OTHER INFORMATION: fusion protein
US-10-239-079-5 

Query Match 81.0%; Score 1381; DB 12; Length 496;

Best Local Similarity 81.0%; Pred. No. 1.6e-117; 

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2; 

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
II I lllhlhIM lh::hllll III :|hllll|:| IIMIi |||||
Db 156 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 215

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
M IMIIIIIIIIhllll IhlllllllllllMI lllllllllllllhllllM
Db 216 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 275

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
HlllilMihhii! Illlllllllllllllll|:| II llhllhlllll
Db 276 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 335

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
llllll I Mil llllhl hlllhll I :||||| llllllllllllllllhlll
Db 336 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 395

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
Ml IMIIIIIIIIIIIIIIIIIIII IIIIIIIIIIIIIIMhIIMMMIIIII
Db 396 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 455

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
Illllll II Uhllll III Ihllllllh I
Db 456 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 492



RESULT 3 
US-10-239-079-6 

; Sequence 6, Application US/10239079 

; Publication No. US20030148446A1 

; GENERAL INFORMATION: 

; APPLICANT: Merck Patent GmbH 

; TITLE OF INVENTION: ANIC-BP1-ligand
; FILE REFERENCE: ANIC-BP-1-ligand
; CURRENT APPLICATION NUMBER: US/10/239,079
; CURRENT FILING DATE: 2002-09-19
; NUMBER OF SEQ ID NOS: 8
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 6 

LENGTH: 552 

TYPE : PRT 

ORGANISM: Artificial Sequence 



FEATURE : 

OTHER INFORMATION: Description of Artificial Sequence: LexA-ANIC-BP-1 
OTHER INFORMATION: fusion protein 
US-10-239-079-6 

Query Match 81.0%; Score 1381; DB 12; Length 552; 

Best Local Similarity 81.0%; Pred. No. 1.9e-117; 

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2 

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
II I lllhlhlll l|:::hllll III HhlMlhl |||||| |||||
Db 212 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 271

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
II lllllllllllhllll Ihlllllllllllill lllllllllllllhllllll
Db 272 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 331

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
HllllllllhhIII IIIIIIIIIIIIMIIIIhl li ll|:|||:|ilM
Db 332 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 391

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
lllllllllllllllhl hlllhll I :||||| llllllllllilllllhlM
Db 392 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 451

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
III MIIMIIIIIIIIIMIIIIII llllllllllllllhhIlllhHMIII
Db 452 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 511

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
IMIIII 11 :|hllll III Ihllllllh I
Db 512 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 548



RESULT 4 

US-10-025-730-3 

; Sequence 3, Application US/10025730 

; Publication No. US20030045466A1 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/10/025,730

; CURRENT FILING DATE: 2001-12-18 

; PRIOR APPLICATION NUMBER: US/09/190,965

PRIOR FILING DATE: 1998-11-13 
; NUMBER OF SEQ ID NOS : 5 
; SOFTWARE: PERL Program 
; SEQ ID NO 3 

LENGTH: 341 

TYPE : PRT 

ORGANISM: Mus sp . 

FEATURE : - 

OTHER INFORMATION: g262934 



US-10-025-730-3 

Query Match 80.8%; Score 1376; DB 15; Length 341;

Best Local Similarity 80.7%; Pred. No. 2.8e-117; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps : 

MPL-FSKSHKNPAEIVKILKDNLAILEKQ DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

II I lll|:|hllM|:::hllll I I I : I h I I II h I I II I I I I II II 



I lllllllllllhllM IhlllllllMIIIII lllllllllllllhllllll 
IPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 

lAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 

UllllllllhhIII llllllllilllllllllhl M llhllhlllll 
:tqqnilfmllkgyespeialncgimlrecirheplakiilwseqfydffryvemstfdi 

.sdafatfkdlltrhkvlvadfleqnydti fedyejcllqsenyvtkrqslkllgeli ldr 
IIIIIIIIIIIMIhl hlllhll I :!!!!! IIMIIIIIMIIIIIhlll 

.SDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 

[NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 

II I I I I I II I I I I I I I I M I I I I II I I I I I M I I M I II h I : I I I I h : I I II M 
[NFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 

LI EFLSSFQKERTDDEQFADEKNYLI KQI RDLKKTA 336 
Mill! II :|hllll III Ihlllhlh I 



Qy 


4 


Db 


1 


Qy 


60 


Db 


61 


Qy 


120 






Qy 


180 


Db 


181 


Qy 


240 


Db 


241 


Qy 


300 


Db 


301 



RESULT 5 

US-10-025-730-4 

; Sequence 4, Application US/10025730

; Publication No. US20030045466A1 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 

FILE REFERENCE: PF-0635 US 
; CURRENT APPLICATION NUMBER: US/10/025,730

CURRENT FILING DATE: 2001-12-18 
; PRIOR APPLICATION NUMBER: US/09/190,965 

PRIOR FILING DATE: 1998-11-13 
; NUMBER OF SEQ ID NOS : 5 

SOFTWARE: PERL Program 
; SEQ ID NO 4 

LENGTH: 339
TYPE : PRT 

ORGANISM: Drosophila melanogaster 
FEATURE : - 

OTHER INFORMATION: g1794137
US-10-025-730-4 



Query Match 65.1%; Score 1109; DB 15; Length 339;



Best Local Similarity 65.0%; Pred. No. 6.5e-93; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 



Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
MM M hi MM M: : M Ml Ml IMIIM -1 M h- III
Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLHGSSDAEPPA 60

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122
MMMIIIM Ih II M II II III 1 MUM II III II Ml MM
Db 61 DYVVAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNLLRRQIGTRSPTVEYICTK 120

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
1 Ml h III hill 1 Mill hi lllhl MM Ihllhlllill
Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239
1 1 1 h 1 1 I: 1 1 1 1 II : I I'll- III 1 - 1 • • 1 1 1 1 M 1 1 • 1 1 1 1 1 1 1 1 II • 1 1 1
Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
III =lhllhllllllllhh:ll lillllllllllllhhl :||::|||:il
Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333
\\^^'^\\'^'-\ M-llll Ml lllllhMI
Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334



RESULT 6 
US-10-025-730-5 

; Sequence 5, Application US/10025730 

; Publication No. US20030045466A1 

; GENERAL INFORMATION: 

; APPLICANT: Tang, Y. Tom 

; APPLICANT: Guegler, Karl J. 

; APPLICANT: Corley, Neil C. 

; APPLICANT: Gorgone, Gina A. 

; TITLE OF INVENTION: CALCIUM BINDING PROTEIN 
; FILE REFERENCE: PF-0635 US 

; CURRENT APPLICATION NUMBER: US/10/025,730

; CURRENT FILING DATE: 2001-12-18 

; PRIOR APPLICATION NUMBER: US/09/190,965

; PRIOR FILING DATE: 1998-11-13 

; NUMBER OF SEQ ID NOS : 5 

SOFTWARE: PERL Program 
; SEQ ID NO 5 

LENGTH: 377 

TYPE: PRT 

ORGANISM: Caenorhabditis elegans 
FEATURE : - 

OTHER INFORMATION: g1255838
US-10-025-730-5 

Query Match 62.4%; Score 1063.5; DB 15; Length 377; 

Best Local Similarity 60.5%; Pred. No. 1.1e-88;

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3; 



Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK-------------QDKKTDKASEEVSKSLQAM 49
II II IllhlhUI h: I Ihl III III :||l|:: :
Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106
I : I : llllllhh: H II I Ml ill! llllhll
Db 61

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166
lilllllllllh I I II h:|| I III Ihllll III: lllllhh I
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226
II Ih llhlllhllhl MM :M:|h MM I h II hlllhl
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286
MMMM'hlMMI llllll hlhlll IMIII llhllllllllllhhl
Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335
Ml Ml M: Ihllll I Mllllll III lllllh:M :
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349



RESULT 7 

US-10-029-386-32324 

; Sequence 32324, Application US/10029386 

; Publication No. US20030194704A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K.

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR GENE
; TITLE OF INVENTION: EXPRESSION ANALYSIS TWO

FILE REFERENCE: AEOMICA-X-2 
; CURRENT APPLICATION NUMBER: US/10/029,386
; CURRENT FILING DATE: 2001-12-20 
; NUMBER OF SEQ ID NOS : 34288 

SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
; SEQ ID NO 32324 
LENGTH: 820 
TYPE : PRT 

ORGANISM: Homo sapiens 
FEATURE : 

OTHER INFORMATION: MAP TO AC000066.1 

OTHER INFORMATION: EXPRESSED IN HELA, SIGNAL =0.87 
OTHER INFORMATION: EXPRESSED IN ADULT LIVER, SIGNAL =1.4 
OTHER INFORMATION: EXPRESSED IN BRAIN, SIGNAL =1.1 
OTHER INFORMATION: EXPRESSED IN HEART, SIGNAL =1.6 
OTHER INFORMATION: EXPRESSED IN LUNG, SIGNAL =1.3 
OTHER INFORMATION: EXPRESSED IN BONE MARROW, SIGNAL =1.3 
OTHER INFORMATION: SWISSPROT HIT: Q99996, EVALUE 0.00e+00
US-10-029-386-32324 



Query Match 7.5%; Score 128.5; DB 12; Length 820;



Best Local Similarity 20.1%; Pred. No. 0.0072; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 



Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
- mil I II M hh h Ih II I
Db 358 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 404

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125
- ■■ III ]■■■■ ■■ ■■■ III h::| II h :
Db 405 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 460

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185
= 11 I - I :| I I
Db 461 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 491

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237
:hl I I - - h-l -I || hi -I h
Db 492 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 551

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264
h h I I II : I ::| |
Db 552 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 611

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320
Db 612 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 668

Qy 321 EKNYLIKQIRDLKK 334
I -I :::: Ih
Db 669



RESULT 8 

US-10-080-608A-11 

; Sequence 11, Application US/10080608A 
; Publication No. US20030198956A1 
; GENERAL INFORMATION: 

; APPLICANT: Makowski, Lee 
; APPLICANT: Hyman, Paul 
; APPLICANT: Williams, Mark 

; TITLE OF INVENTION: STAGED ASSEMBLY OF NANOSTRUCTURES 

; FILE REFERENCE: 8471-010-999 

; CURRENT APPLICATION NUMBER: US/10/080,608A

; CURRENT FILING DATE: 2002-02-21 

; NUMBER OF SEQ ID NOS: 180

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 11 

LENGTH: 3878 

TYPE : PRT 

ORGANISM: Homo sapiens 
US-10-080-608A-11 

Query Match 7.5%; Score 128.5; DB 12; Length 3878; 

Best Local Similarity 20.1%; Pred. No. 0.065; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 



Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 



Db 664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 710

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125
: III 1- : - III h-l II h :
Db 711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185
1 U 1 1 1 :| 1 1
Db 767 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 797

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237
Mhl 1 | :: :: |:::| ::| || |:| ::| |:
Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 857

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264
h h 1 1 II : 1 ::| 1
Db 858 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320
- - 1 1 III :| : h: h :|:| | : :: :|
Db 918 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 974

Qy 321 EKNYLIKQIRDLKK 334
1 -1 :::: Ih
Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997





RESULT 9 
US-10-171-311-4 

Sequence 4, Application US/10171311
Publication No. US20030087270A1
GENERAL INFORMATION:
APPLICANT: Schlegel, Robert
APPLICANT: Chen, Yan
APPLICANT: Zhao, Xumei
APPLICANT: Monahan, John
APPLICANT: Kamatkar, Shubhangi
APPLICANT: Glatt, Karen
APPLICANT: Gannavarapu, Manjula
APPLICANT: Hoersh, Sebastian
TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
TITLE OF INVENTION: OF CERVICAL CANCER
FILE REFERENCE: MRI-035

CURRENT APPLICATION NUMBER: US/10/171,311

CURRENT FILING DATE: 2002-06-12 

PRIOR APPLICATION NUMBER: US 60/298,159 

PRIOR FILING DATE: 2001-06-13 

PRIOR APPLICATION NUMBER: US 60/298,155 

PRIOR FILING DATE: 2001-06-13 

PRIOR APPLICATION NUMBER: US 60/335,936 

PRIOR FILING DATE: 2001-11-14 

NUMBER OF SEQ ID NOS: 238 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 4 
LENGTH: 3899 



TYPE : PRT 

ORGANISM: Homo sapiens 
US-10-171-311-4 



Query Match 7.5%; Score 128.5; DB 15; Length 3899;

Best Local Similarity 20.1%; Pred. No. 0.066;

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
Mill I II : I hi: h ||: || |
Db 652 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 698

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125
- ■ MI !:: : :: III |:::| || h :
Db 699

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185
I M I I I M I I
Db 755 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 785

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237
' '-I'-l I I :: |:::| ::| || |:| ::| |:
Db 786

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264
I in I ::| I
Db 846

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320
I III M : |:: j: :|:| | ::|: :
Db 906

Qy 321 -EKNYLIKQIRDLKK 334
I -I =::: Ih
Db 963



RESULT 10 
US-10-171-311-2 

Sequence 2, Application US/10171311 
Publication No. US20030087270A1 
GENERAL INFORMATION: 
APPLICANT: Schlegel, Robert 
APPLICANT: Chen, Yan 
APPLICANT: Zhao, Xumei 
APPLICANT: Monahan, John 
APPLICANT: Kamatkar, Shubhangi 
APPLICANT: Glatt, Karen 
APPLICANT: Gannavarapu, Manjula 
APPLICANT: Hoersh, Sebastian 



TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
TITLE OF INVENTION: OF CERVICAL CANCER
FILE REFERENCE: MRI-035

CURRENT APPLICATION NUMBER: US/10/171,311

CURRENT FILING DATE: 2002-06-12 

PRIOR APPLICATION NUMBER: US 60/298,159 



; PRIOR FILING DATE: 2001-06-13 

; PRIOR APPLICATION NUMBER: US 60/298,155

; PRIOR FILING DATE: 2001-06-13 

; PRIOR APPLICATION NUMBER: US 60/335,936 

; PRIOR FILING DATE: 2001-11-14 

; NUMBER OF SEQ ID NOS : 238 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 2 

LENGTH: 3907

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-171-311-2 

Query Match 7.5%; Score 128.5; DB 15; Length 3907; 

Best Local Similarity 20.1%; Pred. No. 0.066; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

■■■■ Mill I II : I |:|: h |h II I 
Db 652 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 698 

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

- UN h: : II I h-l || |: : 

Db 699 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

1 =1 I I I :| II 

Db 755 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 785 

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

I '-I'-l I I - - h-l -I II hi ::| h 

Db 786 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

h h I I II : I ::| | 

Db 846 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905 

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320 

- - I I III :| : |:: h :|:| | ::|: : :: :| 

Db 906 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 962 

Qy 321 EKNYLIKQIRDLKK 334 

I ::| :::: ||: 
Db 963 SEQLKQKHGEISFLNEEVKSLKQ 985 
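
The two percentages printed in each result block appear to follow from the other figures in the same block. The sketch below is an assumption inferred from the numbers in this listing, not a documented GenCore formula: Query Match is taken as Score divided by the Perfect score (1704 for this query), and Best Local Similarity as Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the query's self-alignment (perfect) score.

    Assumed relationship, inferred from the figures in this listing.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns.

    Assumed relationship, inferred from the figures in this listing.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures taken from the statistics lines of the result block above.
print(query_match_pct(128.5, 1704))
print(best_local_similarity_pct(77, 75, 116, 115))
```

Under these assumptions the same two functions also reproduce the percentages printed for the other result blocks in this listing, which is the only evidence offered here for the formulas.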



RESULT 11 
US-10-370-685-100
; Sequence 100, Application US/10370685 
; Publication No. US20030215903A1 
; GENERAL INFORMATION: 
; APPLICANT: Hyman, Paul 
; APPLICANT: Goldberg, Edward 

; TITLE OF INVENTION: Nanostructures Containing PNA Joining and Functional Elements

; FILE REFERENCE: NANF.P-004

; CURRENT APPLICATION NUMBER: US/10/370,685



CURRENT FILING DATE: 2003-02-21 
; PRIOR APPLICATION NUMBER: 10/080,608 

PRIOR FILING DATE: 2002-02-21 
; NUMBER OF SEQ ID NOS : 159 
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 100 

LENGTH: 3911 

TYPE: PRT 

ORGANISM: human 
US-10-370-685-100 

Query Match 7.5%; Score 128.5; DB 12; Length 3911; 

Best Local Similarity 20.1%; Pred. No. 0.066; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77

- Mill I II = I hh h l|: || | 
Db 664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 710 

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

- ■■ III I- : - II I |:::| II h = 

Db 711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

I n I I - I :| II 

Db 767 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 797

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

I ■■\--\ I I - h-l -I M hi -I h 

Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 857

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

h |: I I II : I ::| I 

Db 858 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320 

- I I III :| : h: h I : - :| 

Db 918 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 974 

Qy 321 EKNYLIKQIRDLKK 334 

Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997 



RESULT 12 
US-10-171-311-8 

Sequence 8, Application US/10171311 
Publication No. US20030087270A1 
GENERAL INFORMATION: 



APPLICANT: Schlegel, Robert
APPLICANT: Chen, Yan
APPLICANT: Zhao, Xumei
APPLICANT: Monahan, John
APPLICANT: Kamatkar, Shubhangi
APPLICANT: Glatt, Karen
APPLICANT: Gannavarapu, Manjula
APPLICANT: Hoersh, Sebastian



; TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR 

; TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY 

; TITLE OF INVENTION: OF CERVICAL CANCER 

; FILE REFERENCE: MRI-035 

; CURRENT APPLICATION NUMBER: US/10/171,311 

; CURRENT FILING DATE: 2002-06-12 

; PRIOR APPLICATION NUMBER: US 60/298,159 

; PRIOR FILING DATE: 2001-06-13 

; PRIOR APPLICATION NUMBER: US 60/298,155 

; PRIOR FILING DATE: 2001-06-13 

; PRIOR APPLICATION NUMBER: US 60/335,936 

; PRIOR FILING DATE: 2001-11-14 

; NUMBER OF SEQ ID NOS : 238 

; SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 8 

LENGTH: 3917 

TYPE : PRT 

ORGANISM: Homo sapiens 
US-10-171-311-8 



Query Match 7.5%; Score 128.5; DB 15; Length 3917;

Best Local Similarity 20.1%; Pred. No. 0.066; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15; 



Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77
Mill 1 II : 1 hh :: h lh III
Db 652 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 698

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125
: III 1- : - III h-l II h :
Db 699 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185
1 n 1 . 1 :: 1 :| | |
Db 755 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 785

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237
1 :hl 1 | :: :: |:::| ::| || |:| ::| |:
Db 786 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264
h h 1 1 II : 1 ::| 1
Db 846 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320
- - 1 1 III :| = |:: h ^hl 1 -h : - =1
Db 906 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 962

Qy 321 EKNYLIKQIRDLKK 334
1 ::| :::: Ih
Db 963 SEQLKQKHGEISFLNEEVKSLKQ 985





RESULT 13 
US-10-171-311-6 

; Sequence 6, Application US/10171311 
; Publication No. US20030087270A1 



GENERAL INFORMATION:
APPLICANT: Schlegel, Robert
APPLICANT: Chen, Yan
APPLICANT: Zhao, Xumei
APPLICANT: Monahan, John
APPLICANT: Kamatkar, Shubhangi
APPLICANT: Glatt, Karen
APPLICANT: Gannavarapu, Manjula
APPLICANT: Hoersh, Sebastian
TITLE OF INVENTION: NOVEL GENES, COMPOSITIONS, KITS, AND METHODS FOR
TITLE OF INVENTION: IDENTIFICATION, ASSESSMENT, PREVENTION, AND THERAPY
TITLE OF INVENTION: OF CERVICAL CANCER
FILE REFERENCE: MRI-035
CURRENT APPLICATION NUMBER: US/10/171,311
CURRENT FILING DATE: 2002-06-12 
PRIOR APPLICATION NUMBER: US 60/298,159 
PRIOR FILING DATE: 2001-06-13 
PRIOR APPLICATION NUMBER: US 60/298,155 
PRIOR FILING DATE: 2001-06-13 
PRIOR APPLICATION NUMBER: US 60/335,936 
PRIOR FILING DATE: 2001-11-14 
NUMBER OF SEQ ID NOS : 238 
SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 6 
LENGTH: 3925 
TYPE : PRT 

ORGANISM: Homo sapiens 
US-10-171-311-6 



Query Match 7.5%; Score 128.5; DB 15; Length 3925; 

Best Local Similarity 20.1%; Pred. No. 0.067; 

Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

■■■■ Mill I II : I hh h Ih || I 

Db 652 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 698 

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

- : III h: : III h-l II h : 

Db 699 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 754 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

I =1 I I - I :| I I 

Db 755 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 785 

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

I :hl I I :: - h-l -I || |:| -j h 

Db 786 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 845 

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

h h I I II : i ::| | 

Db 846 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 905 

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320 

- I I III :| : I- h :|:| I : :: :| 

Db 906 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 962 



Qy 321 EKNYLIKQIRDLKK 334 

I ::| :::: Ih 
Db 963 SEQLKQKHGEISFLNEEVKSLKQ 985 



RESULT 14 

US-09-864-761-47959 

; Sequence 47959, Application US/09864761 

; Patent No. US20020048763A1 

; GENERAL INFORMATION: 

; APPLICANT: Penn, Sharron G. 

; APPLICANT: Rank, David R. 

; APPLICANT: Hanzel, David K.

; APPLICANT: Chen, Wensheng 

; TITLE OF INVENTION: HUMAN GENOME-DERIVED SINGLE EXON NUCLEIC ACID PROBES USEFUL FOR

; TITLE OF INVENTION: GENE EXPRESSION ANALYSIS BY MICROARRAY 

; FILE REFERENCE: Aeomica-X-1 

; CURRENT APPLICATION NUMBER: US/09/864,761 

; CURRENT FILING DATE: 2001-05-23 

; PRIOR APPLICATION NUMBER: US 60/180,312

; PRIOR FILING DATE: 2000-02-04 

; PRIOR APPLICATION NUMBER: US 60/207,456 

; PRIOR FILING DATE: 2000-05-26 

; PRIOR APPLICATION NUMBER: US 09/632,366 

; PRIOR FILING DATE: 2000-08-03 

; PRIOR APPLICATION NUMBER: GB 24263.6 

; PRIOR FILING DATE: 2000-10-04 

; PRIOR APPLICATION NUMBER: US 60/236,359 

PRIOR FILING DATE: 2000-09-27 
; PRIOR APPLICATION NUMBER: PCT/US01/00666
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00667
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00664
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00669
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00665
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00668
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00663
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00662
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00661
; PRIOR FILING DATE: 2001-01-30
; PRIOR APPLICATION NUMBER: PCT/US01/00670

PRIOR FILING DATE: 2001-01-30 
; PRIOR APPLICATION NUMBER: US 60/234,687 

PRIOR FILING DATE: 2000-09-21 

PRIOR APPLICATION NUMBER: US 09/608,408 
; PRIOR FILING DATE: 2000-06-30 
; PRIOR APPLICATION NUMBER: US 09/774,203 
; PRIOR FILING DATE: 2001-01-29 
; NUMBER OF SEQ ID NOS : 49117 



SOFTWARE: Annomax Sequence Listing Engine vers. 1.1 
SEQ ID NO 47959 
LENGTH: 660 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE:
OTHER INFORMATION: MAP TO AJ010770.1
OTHER INFORMATION: EXPRESSED IN LUNG, SIGNAL = 2
OTHER INFORMATION: EXPRESSED IN BT474, SIGNAL =1.1
OTHER INFORMATION: SWISSPROT HIT: Q99323, EVALUE 3.00e-17
OTHER INFORMATION: EST_HUMAN HIT: AU132932.1, EVALUE 1.00e-105
US-09-864-761-47959 



Query Match 7.3%; Score 125; DB 9; Length 660;

Best Local Similarity 20.5%; Pred. No. 0.011; 

Matches 75; Conservative 68; Mismatches 119; Indels 104; Gaps 13; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

Mill I II : I hh h Ih II I 
Db 342 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 388 

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

:: : ||| |:: : :: || | |:::| II h : 

Db 389 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 444 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

I H I I I :| I I 

Db 445 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 475 

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

I :hl I I - - h-l -I II hi -l h 

Db 476 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 535 

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264 

h h I I II : I ::| I 

Db 536 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 595 

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEK 322 

:: :: I | ||| :| : |:: |: :|:| | ::|: : | : 
Db 596 NPTTVKMKSSVFDEDKTFVA ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 652 

Qy 323 NYLIKQ 328 

: :|| 

Db 653 SEQLKQ 658 



RESULT 15 
US-10-023-634-18 

Sequence 18, Application US/10023634 
Publication No. US20030236389A1 
GENERAL INFORMATION: 



APPLICANT: Shimkets, Richard A
APPLICANT: Colman, Steven D
APPLICANT: Spytek, Kimberly A
APPLICANT: Ballinger, Robert A
APPLICANT: Guo, Xiaojia
APPLICANT: Tchernev, Velizar T
APPLICANT: Shenoy, Suresh G
APPLICANT: Li, Li
APPLICANT: Ellerman, Karen
APPLICANT: Zerhusen, Bryan D
APPLICANT: Patturajan, Meera
APPLICANT: Casman, Stacie J
APPLICANT: Boldog, Ferenc 
APPLICANT: Gusev, Vladimir Y 
APPLICANT: Burgess, Catherine E 
APPLICANT: Edinger, Shlomit R 
APPLICANT: Gangolli, Esha A 
APPLICANT: Malyankar, Uriel M 
APPLICANT: Gunther, Erik 
APPLICANT: Smithson, Glennda 
APPLICANT: Millet, Isabella 
APPLICANT: Gerlach, Valerie 

TITLE OF INVENTION: Proteins, Polynucleotides Encoding Them and Methods of 
TITLE OF INVENTION: Using the Same 
FILE REFERENCE: 21402-221 

CURRENT APPLICATION NUMBER: US/10/023,634
CURRENT FILING DATE: 2002-06-28 
PRIOR APPLICATION NUMBER: 60/256,025 
PRIOR FILING DATE: 2000-12-15 
PRIOR APPLICATION NUMBER: 60/265,163 
PRIOR FILING DATE: 2001-01-30 
PRIOR APPLICATION NUMBER: 60/272,929 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/274,864 
PRIOR FILING DATE: 2001-03-09 
PRIOR APPLICATION NUMBER: 60/276,688 
PRIOR FILING DATE: 2001-03-16 
PRIOR APPLICATION NUMBER: 60/277,880 
PRIOR FILING DATE: 2001-03-22 
PRIOR APPLICATION NUMBER: 60/286,409 
PRIOR FILING DATE: 2001-04-25 
PRIOR APPLICATION NUMBER: 60/309,246 
PRIOR FILING DATE: 2001-07-31 
PRIOR APPLICATION NUMBER: 60/315,600 
PRIOR FILING DATE: 2001-08-29 
NUMBER OF SEQ ID NOS: 132 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 18 
LENGTH: 709 
TYPE : PRT 

ORGANISM: Homo sapiens 
US-10-023-634-18 



Query Match 6.8%; Score 116.5; DB 12; Length 709; 

Best Local Similarity 19.6%; Pred. No. 0.073; 

Matches 79; Conservative 64; Mismatches 146; Indels 115; Gaps 12; 

Qy 3 KMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62 

Db 179 KLQVTQRSLEESQGKIAQLEGKLVSIEKE--KIDEKS-ETEKLLEYIEEISCASDQVEKY 235 

Qy 63 TEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

■■\\\ ■■ \ : i I I I I II 



Db 236 KLDIAQLEENL KEKNDEILSLKQSLEENIVILSKQVE 272 

Qy 123 PHILFMLLKGYEAPQIALRCGIMLRECIRH EPLAKI ILFSNQFRD 167 

: ::| :: :| I . | , . | |, 

Db 273 DLNVKCQLLEKEKEDHVNRNREHNENLNAEMQNLKQKFILEQQERE 318 

Qy 168 FFKYVELSTFDIAS DAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLL 217 

: II : :|:: : : I I : I : | | : : : j 

Db 319 KLQQKELQIDSLLQQEKELSSSLHQKLCSFQEEMVKEKNLFEEELKQTLDELDKLQQKEE 378 

Qy 218 QSENYV TKRQSLKLLGELI LDRHNFA IMTKY 248 

hi I - : MM I : h: : I : M 

Db 379 QAERLVKQLEEEAKSRAEELKLLEEKLKGKEAELEKSSAAHTQATLLLQEKYDSMVQSLE 438 

Qy 249 ISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILL 295 

h hMI : |::h II M: II Ml 

Db 439 DVTAQFEGYKALTASEIEDLKLENSSLQEKAAKAGKNAEDVQHQILATESSNQEYVRMLL 498 

Qy 296 KNQPK LIEFLSSFQKERTD-DEQFADEKNYLIKQIRD 331 

II : I II II I :: ||: | 

Db 499 DLQTKSALKETEIKEITVSFLQKITDLQNQLKQQEEDFRKQLED 542 



Search completed: January 1, 2004, 16:52:26 
Job time : 37 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: 



January 7, 2004, 16:44:17 ; Search time 21 Seconds 

(without alignments) 
1543.278 Million cell updates/sec 



Title: US-10-088-872-2

Perfect score: 1704

Sequence: 1 MKKMPLFSKSHKNPAEIVKI..FADEKNYLIKQIRDLKKTAP 337



Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 

Searched: 283308 seqs, 96168682 residues 

Total number of hits satisfying chosen parameters: 283308 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 



Post -processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : PIR_76:* 

1: pir1:*

2: pir2:* 

3: pir3:* 

4: pir4:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
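
The throughput figure in the run header above can be cross-checked against the other header values. The sketch below rests on an assumption inferred from the numbers, not on a documented GenCore definition: one "cell update" is taken to be one query-residue by database-residue cell of the dynamic-programming matrix, so the rate is query length times residues searched, divided by the search time. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumes one cell per (query residue, database residue) pair.
    """
    return query_len * db_residues / seconds / 1e6


# Header values: 337-residue query, 96168682 residues searched, 21 seconds.
rate = million_cell_updates_per_sec(337, 96168682, 21)
print(round(rate, 3))
```

With the header values for this run the result rounds to the reported 1543.278 Million cell updates/sec, which supports the assumed definition without confirming it.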

SUMMARIES 

                          %
Result                  Query
   No.    Score         Match  Length       DB           ID           Description

    1     1376          80.8   341          2            157997       hypothetical calci
    2     1063.5        62.4   377          2            T16651       hypothetical prote
    3     1006.5        59.1   338          2            T27129       hypothetical prote
    4     834.5         49.0   329          2            T50117       mo25 homolog [impo
    5     685           40.2   305          2            G71441       hypothetical prote
    6     632           37.1   348          2            B84448       hypothetical prote
    7     485           28.5   399          2            S34681       hypothetical prote
    8     143.5         8.4    339          2            T33477       hypothetical prote
    9     134.5         7.9    677          2            H64574       DNA topoisomerase
   10     128           7.5    430          2            H64709       hypothetical prote
   11     125.5         7.4    298          2            B71685       hypothetical prote
   12     125.5         7.4    1642         2            T08880       NMDA receptor-bind
   13     123.5         7.2    1285         2            B72420       hypothetical prote
   14     120           7.0    1175         2            F64489       hypothetical prote
   15     118.5         7.0    959          2            T00246       DNA polymerase V -
   16     115           6.7    474          2            S71322       glutathione syntha
   17     113.5         6.7    833          2            T43446       [illegible]
   18     112.5         6.6    1411         [illegible]  [illegible]  [illegible]
   19     111.5         6.5    [illegible]  [illegible]  [illegible]  [illegible]
   20     111.5         6.5    2401         2            T28676       [illegible]
   21     111           6.5    2166         2            [illegible]  [illegible]
   22     111           6.5    2819         2            A90551       [illegible]
   23     109.5         6.4    457          2            C82911       [illegible]
   24     109.5         6.4    978          2            A70387       [illegible]
   25     109.5         6.4    1830         2            E82909       [illegible]
   26     109           6.4    695          2            T07283       [illegible]
   27     109           6.4    1401         2            S11527       [illegible]
   28     108.5         6.4    442          2            T18507       [illegible]
   29     108.5         6.4    952          2            T50451       [illegible]
   30     108.5         6.4    [illegible]  2            [illegible]  [illegible]
   31     108           6.3    568          2            S73254       [illegible]
   32     107.5         6.3    483          2            140055       [illegible]
   33     107.5         6.3    [illegible]  2            [illegible]  [illegible]
   34     107.5         6.3    1042         2            G64514       [illegible]
   35     107.5         6.3    1726         1            SAZQGM       major merozoite su
   36     107.5         6.3    1726         2            A45948       major merozoite su
   37     107           6.3    570          2            S68686       [illegible]
   38     107           6.3    1173         2            T43527       [illegible]
   39     107           6.3    1727         2            T50073       myosin-like coiled
   40     106           6.2    474          2            S56748       glutathione syntha
   41     106           6.2    1295         2            T24587       hypothetical prote
   42     105.5         6.2    781          2            T00456       protein kinase hom
   43     105.5         6.2    847          2            A56039       GTPase-activating
   44     105.5         6.2    1091         2            T34107       hypothetical prote
   45     105.5         6.2    1619         2            T18499       hypothetical prote



ALIGNMENTS 



RESULT 1 
157997 

hypothetical calcium-binding protein - mouse 
C;Species: Mus sp. (mouse)
C;Date: 02-Aug-1996 #sequence_revision 02-Aug-1996 #text_change 19-May-2000
C;Accession: 157997
R;Miyamoto, H.; Matsushiro, A.; Nozaki, M.
Mol. Reprod. Dev. 34, 1-7, 1993
A;Title: Molecular cloning of a novel mRNA sequence expressed in cleavage stage mouse embryos.
A;Reference number: 157997; MUID:93119656; PMID:8418809
A;Accession: 157997
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-341 <RES>
A;Cross-references: GB:S51858; NID:g262933; PIDN:AAB24801.1; PID:g262934
C;Superfamily: Saccharomyces hypothetical protein YKL189w
C;Keywords: calcium binding



Query Match 80.8%; Score 1376; DB 2; Length 341;

Best Local Similarity 80.7%; Pred. No. 7.9e-85;

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2



Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59

II I lllhlhill lh::hlMI III :|hllllhl llllll lllll 
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

II lllllllllllhllll Ihllllllllllllll lllllllllllllhllllll 

Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

HllllllllhhIII lllllllllllllllllihl II llhllhlllil 
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

Illllllllllllllhl hlllhll I :|||ll llllllllllllllllhlll 
Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299

III llllllllilllillllllllll llllllllllllllhhIlllhHIIIII 

Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336 

Illllll II :|hllll III Ihlllhlh I 
Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337 
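
Because the match lines between the Qy and Db rows are easily damaged in transcription, the identities in a row pair can be recounted from the two residue strings alone. The following is a minimal sketch, assuming that both rows carry the same number of columns and that gaps are written as "-"; the function name is hypothetical and it counts identical residues only, not conservative substitutions.

```python
def count_identities(query_row, db_row):
    """Count columns where the two aligned rows hold the same residue."""
    if len(query_row) != len(db_row):
        raise ValueError("aligned rows must have the same number of columns")
    return sum(1 for q, d in zip(query_row, db_row) if q == d and q != "-")


# Final row pair of the alignment above (Qy 300-336 against Db 301-337).
qy = "KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA"
db = "KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA"
print(count_identities(qy, db))
```

Summing this count over all row pairs of one alignment should give the figure reported as Matches, provided the gap columns of every row have survived intact.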



RESULT 2 
T16651 

hypothetical protein R02E12.2 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 18-Feb-2000 
C;Accession: T16651
R;Leimbach, D.
submitted to the EMBL Data Library, April 1996
A;Description: The sequence of C. elegans cosmid R02E12.
A;Reference number: Z18554
A;Accession: T16651
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-377 <LEI>
A;Cross-references: EMBL:U53337; NID:g1255833; PID:g1255838; PIDN:AAA96187.1; GSPDB:GN00028; CESP:R02E12.2
A;Experimental source: strain Bristol N2; clone R02E12
C;Genetics:
A;Gene: CESP:R02E12.2
A;Map position: X
A;Introns: 37/3; 146/2; 225/1; 315/3
C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 62.4%; Score 1063.5; DB 2; Length 377; 

Best Local Similarity 60.5%; Pred. No. 7e-64; 

Matches 211; Conservative 53; Mismatches 68; Indels 17; Gaps 3 

Qy 4 MP-LFSKSHKNPAEIVKILKDNLAILEK QDKKTDKASEEVSKSLQAM 49

II II lllhlh:|l h: I Ih.l III Ml :|Mh: : 



Db 1 MPLLFGKSHKSPADVVKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVAMI 60

Qy 50 KEILCGTNEKEPPTE---AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR 106 

I M : II :| IIIMIhl:: H II I :|| Mil IIMhIl 

Db 61 KSFIYGNDSAEPSSEHVVQVAQLAQEVYNANILPMLIKMLPKFEFECKKDVGQIFNNLLR 120

Qy 107 RQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166 

IIMIIIIIIIh M II |::|| I III l|:MII III: jjlilhh I 
Db 121 RQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDVFY 180 

Qy 167 DFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKR 226 

II Ih llhlllhllhl MM ::|:||: |||| I h || |:||||:| 
Db 181 TFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVTRR 240 

Qy 227 QSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHK 286 

iiiiiiiihiiiiii MINI hihiii mill iihiiiiiimihhi 

Db 241 QSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANPNK 300

Qy 287 TQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335 

Mi Ml M: IIMIII I Mllllll III lllll|::M ^ 
Db 301 PKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 349 



RESULT 3 

T27129 

hypothetical protein Y53C12A.4 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 21-Jul-2000
C;Accession: T27129
R;Kershaw, J.; Lennard, N.
submitted to the EMBL Data Library, September 1997
A;Reference number: Z20315
A;Accession: T27129
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-338 <WIL>
A;Cross-references: EMBL:Z99277; PIDN:CAB16486.1; GSPDB:GN00020; CESP:Y53C12A.4
A;Experimental source: clone Y53C12A
C;Genetics:
A;Gene: CESP:Y53C12A.4
A;Map position: 2
A;Introns: 29/3; 103/3; 136/2; 215/1; 282/3
C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 59.1%; Score 1006.5; DB 2; Length 338;

Best Local Similarity 57.2%; Pred. No. 3.9e-60; 

Matches 191; Conservative 60; Mismatches 78; Indels 5; Gaps 1; 

Qy 5 PLFSKSHKNPAEIVKILKDNLAILEK QDKKTDKASEEVSKSLQAMKEILCGTNEK 59 

Mi h I l!::|| hi I :::: -j :|| ||:| | | : |:: 

Db 4 PLFGKADKTPADVVKNLRDALLVIDRHGTNTSERKVEKAIEETAKMLALAKTFIYGSDAN 63

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

II I I Mllhh: M II I Ml MM MIIMIIIIIIIIIIIIh 

Db 64 EPNNEQVTQLAQEVYNANVLPMLIKHLHKFEFECKKDVASVFNNLLRRQIGTRSPTVEYL 123 



Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 



Mill 1 1 1 1 1 1 1 1 1 1 1 M 1 1 H II Ihhhl h II :h III
Db 124 AARPEILITLLLGYEQPDIALTCGSMLREAVRHEHLARIVLYSEYFQRFFVFVQSDVFDI 183

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
hllhllllhhil : h:h III 1 1 1 lllllhlllllllllhlll
Db 184 ATDAFSTFKDLMTKHKNMCAEYLDNNYDRFFGQYSALTNSENYVTRRQSLKLLGELLLDR 243

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
II h 1 III: Mill :| Mill MIMIIIIMM II MM Ml Ml M:
Db 244 HNFSTMNKYITSPENLKTVMELLRDKRRNIQYEAFHVFKIFVANPNKPRPITDILTRNRD 303

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333
MMIhM MIMIII Ml IIMIhM:
Db 304 KLVEFLTAFHNDRTNDEQFNDEKAYLIKQIQELR 337



RESULT 4 

T50117 

mo25 homolog [imported] - fission yeast (Schizosaccharomyces pombe) 
C;Species: Schizosaccharomyces pombe
C;Date: 09-Jun-2000 #sequence_revision 09-Jun-2000 #text_change 28-Jul-2000
C;Accession: T50117
R;Seeger, K.; Harris, D.; Wood, V.; Rajandream, M.A.; Barrell, B.G.
submitted to the EMBL Data Library, February 2000
A;Reference number: Z25039
A;Accession: T50117
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-329 <SEE>
A;Cross-references: EMBL:AL157734; PIDN:CAB75774.1; GSPDB:GN00066; SPDB:SPAC1834.06c
A;Experimental source: strain 972h(-); cosmid c1834
C;Genetics:
A;Gene: SPDB:SPAC1834.06c
A;Map position: 1
A;Introns: 34/3; 185/3
C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 49.0%; Score 834.5; DB 2; Length 329;

Best Local Similarity 51.5%; Pred. No. 1.2e-48; 

Matches 169; Conservative 63; Mismatches 93; Indels 3; Gaps 2; 

Qy 6 LFSKSHKNPAEIVKILKDNLAILE-KQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTE 64 

MM |: I III II Ml |: Mill II :: |||| j || : 

Db 4 LFNKRPKSTQDVVRCLCDNLPKLEINNDKK--KSFEEVSKCLQNLRVSLCGTAEVEPDAD 61 

Qy 65 AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPH 124 

M h :M I I h I :MI III l|: MM : M Mhh III 
Db 62 LVSDLSFQIYQSNLPFLLVRYLPKLEFESKKDTGLIFSALLRRHVASRYPTVDYMLAHPQ 121 

Qy 125 ILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAF 184 

I M: I :M I Mill Ml I -M llll:: hlhllMI 
Db 122 IFPVLVSYYRYQEVAFTAGSILRECSRHEALNEVLLNSRDFWTFFSLIQASSFDMASDAF 181 

Qy 185 ATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAI 244 

MM :| II l|:|: ::| h | I h II M I I II M I I I M h : I M I :: 
Db 182 STFKSILLNHKSQVAEFISYHFDEFFKQYTVLLKSENYVTKRQSLKLLGEILLNRANRSV 241 



Qy 245 MTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEF 304 

Ihlll llllill mill llllllllllhllhl h: ::|ll :h III = 
Db 242 MTRYISSAENLKLMMILLRDKSKNIQFEAFHVFKLFVANPEKSEEVIEILRRNKSKLISY 301 

Qy 305 LSSFQKERTDDEQFADEKNYLIKQIRDL 332 

Ihl :| Ulll Ih ::|lll I 
Db 302 LSAFHTDRKNDEQFNDERAFVIKQIERL 329 
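
The two percentages in each result header appear to follow directly from the printed counts. The sketch below is inferred from the numbers in this listing, not taken from GenCore documentation, and the function name `alignment_stats` is hypothetical. It assumes that Query Match is the hit score as a percentage of the perfect score, and that Best Local Similarity is the number of matches as a percentage of all alignment columns (matches, conservative substitutions, mismatches and indels).

```python
def alignment_stats(score, perfect_score, matches, conservative, mismatches, indels):
    """Recompute the two percentages shown in a result header.

    Assumption (inferred from this listing): every alignment column is
    counted exactly once as a match, a conservative substitution,
    a mismatch, or an indel.
    """
    columns = matches + conservative + mismatches + indels
    query_match = 100.0 * score / perfect_score
    best_local_similarity = 100.0 * matches / columns
    return round(query_match, 1), round(best_local_similarity, 1)


# RESULT 4 (T50117): Score 834.5 against a perfect score of 1704,
# with 169 matches, 63 conservative, 93 mismatches and 3 indels.
print(alignment_stats(834.5, 1704, 169, 63, 93, 3))
```

With the RESULT 4 counts this reproduces the printed 49.0% and 51.5%; the same relation also holds for the RESULT 5 figures (685, 135, 68, 89, 2).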



RESULT 5 
G71441 

hypothetical protein - Arabidopsis thaliana 

C;Species: Arabidopsis thaliana (mouse-ear cress)

A;Variety: Columbia

C;Date: 03-Aug-1998 #sequence_revision 03-Aug-1998 #text_change 18-Aug-2000
C;Accession: G71441

R;Bevan, M.; Bancroft, I.; Bent, E.; Love, K.; Goodman, H.; Dean, C.; Bergkamp,
R.; Dirkse, W.; Van Staveren, M.; Stiekema, W.; Drost, L.; Ridley, P.; Hudson,
S-A.; Patel, K.; Murphy, G.; Piffanelli, P.; Wedler, H.; Wedler, E.; Wambutt,
R.; Weitzenegger, T.; Pohl, T.M.; Terryn, N.; Gielen, J.; Villarroel, R.; De
Clerck, R.; Van Montagu, M.; Lecharny, A.; Auborg, S.; Gy, I.; Kreis, M.; Lao,
N.; Kavanagh, T.; Hempel, S.; Kotter, P.; Entian, K.D.; Rieger, M.; Schaeffer,
M.; Funk, B.

Nature 391, 485-488, 1998

A;Authors: Mueller-Auer, S.; Silvey, M.; James, R.; Montfort, A.; Pons, A.;
Puigdomenech, P.; Douka, A.; Voukelatou, E.; Milioni, D.; Hatzopoulos, P.;
Piravandi, E.; Obermaier, B.; Hilbert, H.; Duesterhoft, A.; Moores, T.; Jones,
J.D.G.; Eneva, T.; Palme, K.; Banes, V.; Rechman, S.; Ansorge, W.; Cooke, R.;
Berger, C.; Delseny, M.; Voet, M.; Volckaert, G.; Mewes, H.W.; Klosterman, S.;
Schueller, C.; Chalwatzis, N.

A;Title: Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of
Arabidopsis thaliana.

A;Reference number: A71400; MUID:98121113; PMID:9461215
A;Accession: G71441

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-305 <BEV>

A;Cross-references: GB:Z97343; NID:g2245073; PID:e327051; PID:g2245086
C;Genetics:

A;Map position: 4COP9-4G3845

C;Superfamily: Saccharomyces hypothetical protein YKL189w 

Query Match 40.2%; Score 685; DB 2; Length 305; 

Best Local Similarity 45.9%; Pred. No. 1.1e-38;

Matches 135; Conservative 68; Mismatches 89; Indels 2; Gaps 2; 

Qy 41 EVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQI 100 

hllh: :| II I :| II II III II : : I h I - I :|| Ih 
Db 8 ELSKSIRDLKLILYGNSEAEPVAEACAQLTQEFFKADTLRRLLTSLPNLNLEARKDATQV 67 

Qy 101 FNNILRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIIL 160 

h hh :| :h - h hi HI I I lllllh HI =1 

Db 68 VANLQRQQVNSRLIAADYLESNIDLMDFLVDGFENTDMALHYGTMFRECIRHQIVAKYVL 127 

Qy 161 FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDY-EKLLQS 219 

I II h:i Nihil lllhllllll Ihl! :|| I II llhl 



Db 128 DSEHVKKFFYYIQLPNFDIAADAAATFKELLTRHKSTVAEFLIKNEDWFFADYNSKLLES 187



Qy 220 ENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKV 279 

lhhlh:||lh::||| I hlllhl Hh^UMIh I II llllllh 
Db 188 TNYITRRQAIKLLGDILLDRSNSAVMTKYVSSMDNLRILMNLLRESSKTIQIEAFHVFKL 247 

Qy 280 FVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

llh n I Ih h Ih h : :|hl :| --I \ \ 

Db 248 FVANQNKPSDIANILVANRNKLLRLLADIKPDK-EDERFDADKAQVVREIANLK 300



RESULT 6 
B84448 

hypothetical protein At2g03410 [imported] - Arabidopsis thaliana 
C;Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 02-Feb-2001 #sequence_revision 02-Feb-2001 #text_change 16-Feb-2001 

C;Accession: B84448

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
G.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.

Nature 402, 761-768, 1999

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.

A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: B84448
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-348 <STO>

A;Cross-references: GB:AE002093; NID:g4335758; PIDN:AAD17435.1; GSPDB:GN00139
C;Genetics:
A;Gene: At2g03410
A;Map position: 2

C;Superfamily: Saccharomyces hypothetical protein YKL189w 

Query Match 37.1%; Score 632; DB 2; Length 348; 

Best Local Similarity 38.7%; Pred. No. 4.4e-35;

Matches 133; Conservative 80; Mismatches 117; Indels 14; Gaps 5; 

Qy 6 LFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASE-------EVSKSLQAMKEILCGTNE 58

II : 1 llh :| :h I --11 : h UNI I 

Db 4 LFKNKSRLPGEIVRQTRDLIALAESEEEETDARNSKRLGICAELCRNIRDLKSILYGNGE 63 

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEY 118 

II II I II : : I II : :| I :|| ||| h -h I II 
Db 64 AEPVPEACLLLTQEFFRADTLRPLIKSIPKLDLEARKDATQIVANLQKQQVEFRLVASEY 123 

Qy 119 ISAHPHILFMLLKGYEAP-QIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTF 177 

^ - - l-n ::|| Ihlhlh :|l II I II Ihl I 

Db 124 LESNLDVIDSLVEGIDHDHELALHYTGMLKECVRHQVVAKYILESKNLEKFFDYVQLPYF 183

Qy 178 DIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYE-KLLQSENYVTKRQSLKLLGELI 236 

hhll |::|||||| \\:-\ I H llh :| ||||: ||||::: 

Db 184 DVATDASKIFRELLTRHKSTVAEYIiAKNYEWFFAEYNTKLLEKGSYFTKRQASKLLGDVL 243 



Qy 237 LDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLK 296 

HI I :| Ihl :|h:||||l|: ■ III lllhlhilj: :| : II Ih 
Db 244 MDRSNSGVMVKYVSSLDNLRIMMNLLREPTKNIQLEAFHIFKLFVANENKPEDIVAILVA 303 

Qy 297 NQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK----KTA 336

|: |:: : : |: :| | :| :| | ||| 

Db 304 NRTKILRLFADLKPEK-EDVGFETDKALVMNEIATLSLLDIKTA 346 



RESULT 7 
S34681 

hypothetical protein YKL189w - yeast (Saccharomyces cerevisiae) 

C;Species: Saccharomyces cerevisiae

C;Date: 30-Sep-1993 #sequence_revision 30-Sep-1993 #text_change 19-Apr-2002
C;Accession: S34681; S33963; S38021; S38026

R;Wieman, S.; Voss, H.; Schwagaer, C.; Rupp, T.; Stegemann, J.; Zimmermann, J.;
Grothues, D.; Sensen, C.; Erfle, H.; Hewitt, N.; Banrevi, A.; Ansorge, W.
submitted to the EMBL Data Library, July 1993

A;Description: Sequencing and analysis of 51.5 kilobases on the left arm of
chromosome XI from Saccharomyces cerevisiae reveals 23 open reading frames
including the FAS1 gene.

A;Reference number: S34679
A;Accession: S34681
A;Molecule type: DNA
A;Residues: 1-399 <WIE>

A;Cross-references: EMBL:X74151; NID:g450365; PIDN:CAA52249.1; PID:g395236
A;Experimental source: strain S288C
R;Cheret, G.; Mattheakis, L.C.; Sor, F.
Yeast 9, 661-667, 1993

A;Title: DNA sequence analysis of the YCN2 region of chromosome XI in
Saccharomyces cerevisiae.

A;Reference number: S33960; MUID:93348778; PMID:8394042
A;Accession: S33963
A;Molecule type: DNA
A;Residues: 1-399 <CHE>

A;Cross-references: GB:X69765; NID:g296985; PIDN:CAA49422.1; PID:g296989
R;Wiemann, S.; Voss, H.; Schwager, C.; Rupp, T.; Grothues, D.; Sensen, C.;
Stegemann, J.; Zimmermann, J.; Erfle, H.; Hewitt, N.; Ansorge, W.
submitted to the Protein Sequence Database, March 1994
A;Reference number: S37825
A;Accession: S38021
A;Molecule type: DNA
A;Residues: 1-399 <WI2>

A;Cross-references: EMBL:Z28189; NID:g486334; PIDN:CAA82032.1; PID:g486335;
MIPS:YKL189w

A;Experimental source: strain S288C

R;Maia e Silva, A.; Bossier, P.; Vilela, C.; Fernandes, L.; Soares, H.;
Guerreiro, P.; Rodrigues-Pousada, C.
submitted to the Protein Sequence Database, March 1994
A;Reference number: S38024
A;Accession: S38026
A;Molecule type: DNA
A;Residues: 1-399 <MAI>

A;Cross-references: EMBL:Z28189; NID:g486334; PIDN:CAA82032.1; PID:g486335;
MIPS:YKL189W

A;Experimental source: strain S288C

C;Genetics:

A;Gene: SGD:HYM1

A;Cross-references: SGD:S0001672
A;Map position: 11L

C;Superfamily: Saccharomyces hypothetical protein YKL189w

Query Match 28.5%; Score 485; DB 2; Length 399;

Best Local Similarity 33.0%; Pred. No. 3.6e-25; 

Matches 113; Conservative 75; Mismatches 138; Indels 16; Gaps 6; 



Qy 7 FSKSHKNPAEIVKILKDNLAILEK----QDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62

: h 1 1- : 1 II 1 1 II U 1 | : | : j 
Db 16 WKKNPKTPSDYARLIIEQLNKFSSPSLTQDNKR-KVQEECTKYLIGTKHFIVGDTDPHPT 74

Qy 63 TEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122

||: :| :: : : |: ::|| ::: ||: | : Ihh : 
Db 75 PEAIDELYTAMHRADVFYELLLHFVDLEFEARRECMLIFSICLGYSKDNKFVTVDYLV^O 134

Qy 123 PHILFMLLKGYE-------APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELS 175

1 : - h 1 1 1 1 h Mh:| 1 :|ll 1 11- :| 

Db 135 PKTISLMLRTAEVALQQKGCQDIFLTVGNMIIECIKYEQLCRIILKDPQLWKFFEFAKLG 194

Qy 176 TFDIASDAFATFKDLLTRHKVLVA-DFL--EQNYDTIFEDYEKLLQSENYVTKRQSLKLL 232

hi:::: 1 1 Ih H II : Ih Ullllll III 
Db 195 NFEISTESLQILSAAFTAHPKLVSKEFFSNEINIIRFIKCINKLMAHGSYVTKRQSTKLL 254

Qy 233 GELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVE 292

Ih 1 1 hi Ih lllllhl h III hi llhllll Ihl h:h : 
Db 255 ASLIVIRSNNALMNIYINSPENLKLIMTLMTDKSKNLQLEAFNVFKVMVANPRKSKPVFD 314

Qy 293 ILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK 334

Ihlh Ih : :| : : 1 1 \\: :::::| | : 
Db 315 ILVKNRDKLLTYFKTFGLD-SQDSTFLDEREFIVQEIDSLPR 355





RESULT 8 
T33477 

hypothetical protein T27C10.3 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 29-Oct-1999
C;Accession: T33477

R;Zhu, H.J.; Graves, T.; Hawkins, M.
submitted to the EMBL Data Library, October 1998

A;Description: The sequence of C. elegans cosmid T27C10.
A;Reference number: Z21354
A;Accession: T33477
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-339 <ZHU>

A;Cross-references: EMBL:AF098504; PIDN:AAC67411.1; GSPDB:GN00019; CESP:T27C10.3
A;Experimental source: strain Bristol N2; clone T27C10

C;Genetics:
A;Gene: CESP:T27C10.3
A;Map position: 1
A;Introns: 72/3; 120/3; 233/3; 295/1



Query Match 8.4%; Score 143.5; DB 2; Length 339;



Best Local Similarity 19.3%; Pred. No. 0.02; 

Matches 38; Conservative 50; Mismatches 76; Indels 33; Gaps 4; 

Qy 159 ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ 218 

- :hlll Ih I I : : : I : Ih 

Db ICQ LMNTNKFRD FDVIQGTFDTLQI IFFTNHESANNFIKNNLPRFMQTLHKLIA 150 

Qy 219 SENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFK 278 

h : :| I I II : h : --I Ul- - :: | : : 

Db 151 CSNFFIQAKSFKFLNELFTAQTNYETRSLWMAEPAFIKLWLiAIQSNKHAVRSRAVSILE 210 

Qy 279 VFVASPHKTQPIVEILLKNQPKLIEFL SS FQKERTDDEQFAD 320 

:h :| : : I : :h II I I :|| I hi 

Db 211 IFIRNPRNSPEVHEFIGRNRNVLIAFFFNSAPIHYYQGSPNEKE---DAQYARMAYKLLN 267 

Qy 321 ---EKNYLIKQIRDLKK 334 

:: : :|::| :: 
Db 268 WDMQRPFTQEQLQDFEE 284 



RESULT 9 

H64574 

DNA topoisomerase I - Helicobacter pylori (strain 26695) 
C;Species: Helicobacter pylori

C;Date: 09-Aug-1997 #sequence_revision 09-Aug-1997 #text_change 08-Oct-1999
C;Accession: H64574

R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997

A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.

A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.

A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: H64574

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-677 <TOM>

A;Cross-references: GB:AE000559; GB:AE000511; NID:g2313536; PIDN:AAD07502.1;
PID:g2313542; TIGR:HP0440
C;Superfamily: DNA topoisomerase I

Query Match 7.9%; Score 134.5; DB 2; Length 677; 

Best Local Similarity 21.6%; Pred. No. 0.19; 

Matches 88; Conservative 58; Mismatches 134; Indels 127; Gaps 16; 

Qy 7 FSKSHKNPA-EIVKILKDNL AILEKQDKK TDKASEEVSKSLQAMKE 51 

I II I : :| III I :: |: II I I : ||||: 

Db 222 FKFKDKNEASQFLKDLKDGLGSMSVLVSLKESLSNKEPKKPFTTSKLLSQASKSLKI--- 278 

Qy 52 ILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGT 111 

Ih :|lllhh :|h n : I h II 



Db 279 PTKEIAQLAQKLFEAGLITYHRTDSEFLSPEYLKEHEVFFEPIY 322

Qy 112 RSPTV EYIS AHPHILFMLLKGYEAPQIALRCGIMLRECIRHE 153

\-\ W ■■ III I I I :|: : I : I 

Db 323 --PSVYQYREYKAGKNSQAEAHEAIRITHPHALKDLEKVCSDAKISEELALKLYQLIYTN 380

Qy 154 PL AKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIF 210

••I'M- M I I • ♦ I M I • I I ♦ I 

Db 381 TICSQSRNALY-NQYDCIFK IKSESFKLSFKLLKEKGFLEIEELIQGKEEIN 431

Qy 211 EDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQ 270

= h : Ih I I h : : = | || : || = 

Db 432 RE-EQESEIENFSLKENDSVPLKEVFIKK lEKPSPKPYKESAFIPLLESEG 481

Qy 271 FEAFHVFKVFVASPHKTQPIVEILLKNQ PKLIEFLSSFQKERTDD- 315

: :| :| |:|:: | 
Db QGLEVISFFKKDKEVDF 531

Qy 316 EQF ADEKNYLIKQIRDLKKTA 336

I II II 
Db I SKLKSTA 578



RESULT 10 

H64709 

hypothetical protein HP1520 - Helicobacter pylori (strain 26695) 
C;Species: Helicobacter pylori 

C;Date: 09-Aug-1997 #sequence_revision 09-Aug-1997 #text_change 08-Oct-1999

C;Accession: H64709

R;Tomb, J.F.; White, O.; Kerlavage, A.R.; Clayton, R.A.; Sutton, G.G.;
Fleischmann, R.D.; Ketchum, K.A.; Klenk, H.P.; Gill, S.; Dougherty, B.A.;
Nelson, K.; Quackenbush, J.; Zhou, L.; Kirkness, E.F.; Peterson, S.; Loftus, B.;
Richardson, D.; Dodson, R.; Khalak, H.G.; Glodek, A.; McKenney, K.; Fitzegerald,
L.M.; Lee, N.; Adams, M.D.; Hickey, E.K.; Berg, D.E.; Gocayne, J.D.; Utterback,
T.R.; Peterson, J.D.; Kelley, J.M.; Cotton, M.D.; Weidman, J.M.; Fujii, C.;
Bowman, C.; Watthey, L.
Nature 388, 539-547, 1997

A;Authors: Wallin, E.; Hayes, W.S.; Borodovsky, M.; Karpk, P.D.; Smith, H.O.;
Fraser, C.M.; Venter, J.C.

A;Title: The complete genome sequence of the gastric pathogen Helicobacter
pylori.

A;Reference number: A64520; MUID:97394467; PMID:9252185
A;Accession: H64709

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-430 <TOM>

A;Cross-references: GB:AE000650; GB:AE000511; NID:g2314700; PIDN:AAD08565.1;
PID:g2314705; TIGR:HP1520

C;Superfamily: Helicobacter pylori hypothetical protein HP1520 

Query Match 7.5%; Score 128; DB 2; Length 430; 

Best Local Similarity 20.9%; Pred. No. 0.29; 

Matches 82; Conservative 73; Mismatches 135; Indels 102; Gaps 20; 

Qy 7 FSKSHKNPAEI VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62

I : h II lllhhh h : :||: : I - j: 
Db 60 FYPNRKSKIEIEFNGEKILKENVAVFHSYDE>-EFSSEDSVTTFMAKSDL KQQY 111 



Qy 63 TEAVAQLAQELYSSGLLVTL--IA DLQLIDFEGKKDVTQIFNNILR 106 

^ U :| II :| II I I I :| :| I 

Db 112 DNILLELEKE--KKALLKSLRDIASGFDYEEEIKTIKNEKNKSFYEILDNHLTEIESSEK 169 

Qy 107 RQIGTRSPTV-EYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKII 159 

I I I I :::: | :: j: | |:: 

Db 170 HYSFKYRDIFDGSKKVKDFVNKHHDLIEQYFNKYQ ELLSQSK 211 

Qy 160 LF SNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQ 204 

Db 212 IFKHMNSGDFGTNHADDLKKALENNRFFKANHSLKIAGEEITNYQKL-SDIFENEKNRIL 270 

Qy 205 NYDTIFEDYEKLLQSENYVTKRQSLKLLGELI LDRHNF- >AIMTKYISKP 252 

n U ::|: | : : || : | jj :| :: |: : 

Db 271 NNEELKESFDKI EKVINANKELKAFJCDAISKDNTLLTEFLDYDSFRKKVLFSYLKQV 327 

Qy 253 -ENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKE 311 

:hl -II hi Ih : I : : -11 || h I I = 

Db 328 IQNVKSLVNLYREKKPEIE EIIKQASKDQKEWESVIEIF- -NQRFLVPFKVELQNQ 381 

Qy 312 R TDDEQ FADEKNYLIKQIRDLKK 334 

I I hh : I Ihl 

Db 382 KDILLNKDAAQFRFIFSDDNQDMNVQKEDLQK 413 

RESULT 11 
B71685 

hypothetical protein RP295 - Rickettsia prowazekii
C;Species: Rickettsia prowazekii

C;Date: 21-Nov-1998 #sequence_revision 21-Nov-1998 #text_change 03-Nov-2000
C;Accession: B71685

R;Andersson, S.G.E.; Zomorodipour, A.; Andersson, J.O.; Sicheritz-Ponten, T.;
Alsmark, U.C.M.; Podowski, R.M.; Naeslund, A.K.; Eriksson, A.S.; Winkler, H.H.;
Kurland, C.G.
Nature 396, 133-140, 1998

A;Title: The genome sequence of Rickettsia prowazekii and the origin of
mitochondria.

A;Reference number: A71630; MUID:99039499; PMID:9823893
A;Accession: B71685

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-298 <AND>

A;Cross-references: GB:AJ235271; GB:AJ235269; NID:g3868717; PIDN:CAA14756.1;
PID:g3860856; GSPDB:GN00081

A;Experimental source: strain Madrid E

C;Genetics:

A;Gene: RP295

Query Match 7.4%; Score 125.5; DB 2; Length 298;

Best Local Similarity 20.1%; Pred. No. 0.27; 

Matches 62; Conservative 57; Mismatches 114; Indels 75; Gaps 13; 

Qy 73 LYSSGLLVTLIADLQLIDFEGKKDVTQ IFNNILRRQIGTRS 113 

h hll : ::|: : :|| : : I : I 

Db 6 LFIQLLIVTSLVKAEIIEVDSLNKITQDFKVNYNKNYLPQDLLWTVLDKFLFKSFGV-- 63 



Qy 114 PTVEYISAHPHILFMLLKGY--EAPQIALRCGIMLRECIRHEPLAKIILFSNQFR 166

I Ml I - I : : hi : :| - 

Db 64 PIGEYI DQHRYLALAPLFSHI NKNPKI lY 1 TQLI LTNNSYKKELQE 109

Qy 167

hi I I : : : :: IhhU 

Db 110

Qy 221

Ih : I : : : I I h h-| I II 
Db 167 NYI IFNNLDSFDNTYPVFYKGILTSNNI PASKVILNFL IQINFI PKC 213

Qy 278

-I ^ Mil I |: I :||: : | | || || 

Db 214 :lisssrellrsmefqlnnyssnilfigyhynnksisddkdykdiayytkmindlipqi

Qy 330 RDLKKTAP 337

Ih I 

Db 274



RESULT 12 
T08880 

NMDA receptor-binding protein yotiao - human 

C;Species: Homo sapiens (man)

C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 21-Jul-2000
C;Accession: T08880

R;Lin, J.W.; Wyszynski, M.; Madhavan, R.; Sealock, R.; Kim, J.U.; Sheng, M.

J. Neurosci. 18, 2017-2027, 1998

A;Title: Yotiao, a novel protein of neuromuscular junction and brain that
interacts with specific splice variants of NMDA receptor subunit NR1.
A;Reference number: Z16511; MUID:98151389; PMID:9482789

A;Accession: T08880

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1642 <LIN>

A;Cross-references: EMBL:AF026245; NID:g2623067; PIDN:AAB86384.1; PID:g2623068
C;Genetics:

A;Map position: 7q21-22

C; Keywords: brain; cerebral cortex; coiled coil; neuromuscular junction; 
skeletal muscle 

Query Match 7.4%; Score 125.5; DB 2; Length 1642; 

Best Local Similarity 20.2%; Pred. No. 2.4; 

Matches 77; Conservative 73; Mismatches 117; Indels 115; Gaps 15; 

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77 

- Illll I M : I hh h Ih II I 
Db 664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 710 

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125 

- • III h: : III h-l II h : 

Db 711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185 

1=11 I - I :| I I 

Db 767 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 797 



Qy 186

M H I : : : | : : : : | : : : | : : | | | | : | : : | | : ' 

Db 798 'LEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIE]

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD

h |: I I II : I ::| | 

Db 858 IRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQR:

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD

- - I I III :| : h |: :|:| j -h : =: :| 
Db 918 [PTTVKMKSSVFDEDKTFVA ETLEMGEVXEJCDTTELMEKLEVTKREKLELSQRLSDJ

Qy 321 EKNYLIKQIRDLK 333

I ::| :::: II 

Db 975



RESULT 13 
B72420 

hypothetical protein TM0088 - Thermotoga maritima (strain MSB8) 
C;Species: Thermotoga maritima 

C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 21-Jul-2000
C;Accession: B72420

R;Nelson, K.E.; Clayton, R.A.; Gill, S.R.; Gwinn, M.L.; Dodson, R.J.; Haft,
D.H.; Hickey, E.K.; Peterson, J.D.; Nelson, W.C.; Ketchum, K.A.; McDonald, L.;
Utterback, T.R.; Malek, J.A.; Linher, K.D.; Garrett, M.M.; Stewart, A.M.;
Cotton, M.D.; Pratt, M.S.; Phillips, C.A.; Richardson, D.; Heidelberg, J.;
Sutton, G.G.; Fleischmann, R.D.; White, O.; Salzberg, S.L.; Smith, H.O.; Venter,
J.C.; Fraser, C.M.
Nature 399, 323-329, 1999

A;Title: Evidence for lateral gene transfer between Archaea and Bacteria from
genome sequence of Thermotoga maritima.

A;Reference number: A72200; MUID:99287316; PMID:10360571
A;Accession: B72420
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1285 <ARN>

A;Cross-references: GB:AE001695; GB:AE000512; NID:g4980569; PIDN:AAD35182.1;
PID:g4980577; TIGR:TM0088

A;Experimental source: strain MSB8

C;Genetics:

A;Gene: TM0088



Query Match 7.2%; Score 123.5; DB 2; Length 1285; 

Best Local Similarity 21.5%; Pred. No. 2.4; 

Matches 86; Conservative 78; Mismatches 129; Indels 107; Gaps 23; 

Qy 1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQD KKT DKASEEV SKS 45 

_ = I 11=1 h : I I : : : | | | | || j || 

Db 556 LKVAMLSGKEEEN VQKAAEELQI ISSEERI IRFVKKTENVPIDKAKNWLQLYSVS 611 

Qy 46 LQAMKEILCGTNEKEPPTEAVAQLAQELYSSGL LVTLIAD-- 85

- : I hi I I I |:::|| : |: :: : 

Db 612 lEELGNELWIGERE-EVEKAADLLQKIFSSEVEISRDFVKLPSWIDEQEKLLEWKNSA 670 

Qy 86 ---LQLID FEGKKD VTQIFNNILRRQIG--TRSPTVEYI -- -SAHPHILFML 129 



:-| III h ::|::|: : :| : |||:: h | h 

Db 671 GITYEILDGWYFEGTKENVEKAKELFSDIVEK-LGEVRKEETVEFLEVNSSFPVDEFIN 729

Qy 130 LKGYEAPQIALRCGIMLRECIRHEPLAKIIL FSNQFRDFF KYVELST 176

III: I : I ::| h :||| Ih : 

Db 730 LSGKLYPDVT CFSLDQLGLLVLKGSSEAVEDLSSMYRSFFERHQKIVKENV 780

Qy 177 FD IASDAFATFKDLLTRHKVLVADFLEQNYDTI FEDYEKLLQSENYVTKRQSLKLLG 233

II : ■■ -l-- I :|ll : : - I || | |:: : :| | 

Db 781 FDRLMLEVPSGFSFEEFKTFLEVLVPEVKQ WYLDKLNLLLVEVPVSQSERVKSLL 836

Qy 234 ELI LDRHNFAI MTKY IS KPENL - KLMMNLLRDKS PNI QFEAF - HVFKVFVAS 283

: n I : h I : III |:: I :: :: I 

Db 837 DTFLKKEEAVSEKKAVKSVTI PSGVNPDELSSYLKKLLR NVEITVFPNMGQMI VEG 892

Qy 284 P - HKTQP I VE I LLKNQPKLI EFLSSFQKERTDDEQFADEK 322

I - Ih: : 1- III I : :| I 

Db 893



RESULT 14 

F64489 

hypothetical protein MJ1519 - Methanococcus jannaschii 
C;Species: Methanococcus jannaschii 

C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: F64489

R;Bult, C.J.; White, O.; Olsen, G.J.; Zhou, L.; Fleischmann, R.D.; Sutton, G.G.;
Blake, J.A.; FitzGerald, L.M.; Clayton, R.A.; Gocayne, J.D.; Kerlavage, A.R.;
Dougherty, B.A.; Tomb, J.F.; Adams, M.D.; Reich, C.I.; Overbeek, R.; Kirkness,
E.F.; Weinstock, K.G.; Merrick, J.M.; Glodek, A.; Scott, J.L.; Geoghagen,
N.S.M.; Weidman, J.F.; Fuhrmann, J.L.; Nguyen, D.; Utterback, T.R.; Kelley,
J.M.; Peterson, J.D.; Sadow, P.W.; Hanna, M.C.; Cotton, M.D.; Roberts, K.M.;
Hurst, M.A.

Science 273, 1058-1073, 1996

A;Authors: Kaine, B.P.; Borodovsky, M.; Klenk, H.P.; Fraser, C.M.; Smith, H.O.;
Woese, C.R.; Venter, J.C.

A;Title: Complete genome sequence of the methanogenic archaeon, Methanococcus
jannaschii.

A;Reference number: A64300; MUID:96337999; PMID:8688087
A;Accession: F64489

A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-1175 <BUL>

A;Cross-references: GB:U67593; GB:L77117; NID:g2826427; PIDN:AAB99538.1;
PID:g1500409; TIGR:MJ1519
C;Genetics:

A;Map position: FOR1494096-1497623

Query Match 7.0%; Score 120; DB 2; Length 1175; 

Best Local Similarity 21.5%; Pred. No. 3.6; 

Matches 76; Conservative 58; Mismatches 131; Indels 88; Gaps 15; 

Qy 7 FSKSHKNPAEIVKILKD-NLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEA 65

hi : : II I hi II h :| : I : h M 

Db 232 FNKFREENQDFDKYLTDENIAFRPHVMKKFDEFAENIKKVIAELE GSKYKYPGLPG 287 



Qy 66 VAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHI 125 



Db 288 V LYFLGMEDAYSRYIELWKNEGEKGEEKLYNALI-ESLENRKENLEF 333

Qy 126

I ^ : I I :||:| I I III I : 

Db 334 - GI TKKVI DKF I AQKEEFREFLKN YAVY YELSAFKLEK 370

Qy 181 - SDAFATFKDLLTRHKVLVADFLEQNYDTI FEDYEKLLQSENYVTKRQSL 229

I :|l I |::| : ■ \ : : : I :|: | 

Db 371

Qy 230

II I II II 11 II :|| : I |: I 

Db 425 IPYRVRALLVE-ILKRHLSSGNTTISTK DLKDFFEKMDKDIVKITFDEP]

Qy 275 TKVFVASPHKTQPI VEI LLKNQPKLI EFLSSFQKERTDDEQFADEKNYLI K 327

■■\ ■■■ I : : : : h I I I : I :| : |||:| 

Db 478



RESULT 15 

T00246 

DNA polymerase V - fission yeast (Schizosaccharomyces pombe) 
C; Species: Schizosaccharomyces pombe 

C;Date: 01-Feb-1999 #sequence_revision 01-Feb-1999 #text_change 31-Jan-2000
C;Accession: T00246; T39442
R;Shimizu, K.

submitted to the EMBL Data Library, March 1998

A;Description: S. pombe homolog of S.cerevisiae DNA polymerase V.
A;Reference number: Z14129
A;Accession: T00246

A;Status: preliminary; translated from GB/EMBL/DDBJ

A;Molecule type: mRNA
A;Residues: 1-959 <SHI>

A;Cross-references: EMBL:AB012696; NID:d1224325; PIDN:BAA32046.1; PID:d1033008
R;Xiang, Z.; Aves, S.; Lyne, M.; Rajandream, M.A.; Barrell, B.G.; Volckaert, G.
submitted to the EMBL Data Library, March 1998
A;Reference number: Z21854
A;Accession: T39442

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-959 <LYN>

A;Cross-references: EMBL:AL022305; PIDN:CAA18436.1; GSPDB:GN00067;
SPDB:SPBC14C8.14c

A;Experimental source: strain 972h-; cosmid c14C8
C;Genetics:

A;Gene: pol5+; SPBC14C8.14c
A;Map position: 2
A;Introns: 66/3

Query Match 7.0%; Score 118.5; DB 2; Length 959; 

Best Local Similarity 20.5%; Pred. No. 3.5; 

Matches 80; Conservative 63; Mismatches 135; Indels 113; Gaps 19; 

Qy 9 KSHKN PAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62 

II II :-h: 1 II II - II 
Db 522 KSPKNNLLISMDESVIEIVQKSLSVLHKVTKKIDKKAQHL-QQLNAF 567 



Qy 63

i III II I II I :::|| :|: : II I 

Db 568 - QLLYSLVLLQVYAGDTDS I DVLED I DNCYSKVFNKKS KRESTSNEPTAME I L 619

Qy 120 -SAHPHILF MLLKGY EAPQIALRC GIMLRECI 150

■■ \ -l W ■■ III h I : 

Db 620

Qy 151

I h := : II :::: = |h I 

Db 680

Qy 211 EDYEKLLQ SENYVTKRQSLKL LGELILDRHNFAIMTKYI SKPENLKLMMNLL 262

II : I = I I I I h :| 11=1 

Db 731

Qy 263 RDKSPNIQFEAFHVFKV- -FVASPHKTQ PIVEILLKNQPKLIE 303

I =lh II : : :||l l-l-ll = |::| 

Db 781

Qy 304

I : I : :|: || | | 

Db 837



Search completed: January 7, 2004, 16:46:10 
Job time : 32 secs



GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd.

OM protein - protein search, using sw model

Run on:         January 7, 2004, 16:44:17 ; Search time 17 Seconds
                (without alignments)
                932.235 Million cell updates/sec

Title:          US-10-088-872-2
Perfect score:  1704
Sequence:       1 MKKMPLFSKSHKNPAEIVKI...FADEKNYLIKQIRDLKKTAP 337

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       127863 seqs, 47026705 residues

Total number of hits satisfying chosen parameters: 127863

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      SwissProt_41:*

Pred. No. is the number of results predicted by chance to have a 

score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
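
The throughput figure in the run header can be reproduced from the other header values. The sketch below rests on an assumption inferred from the printed numbers rather than from GenCore documentation: one "cell update" is taken to be one query residue compared against one database residue, so the total is query length times database residues. The function name `cell_updates_per_second` is hypothetical.

```python
def cell_updates_per_second(query_length, db_residues, seconds):
    """Estimate search throughput in cell updates per second.

    Assumption: the search fills one dynamic-programming cell for every
    (query residue, database residue) pair.
    """
    cells = query_length * db_residues
    return cells / seconds


# Header values of this run: 337-residue query, 47026705 database
# residues, search time 17 seconds.
rate = cell_updates_per_second(337, 47026705, 17)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under this assumption the result agrees with the 932.235 Million cell updates/sec reported for the 17-second search.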



SUMMARIES 

                    %
Result          Query
   No.   Score  Match  Length  DB  ID           Description

     1    1685   98.9     334   1  MO2L_HUMAN   Q9h9s4 homo sapien
     2    1669   97.9     334   1  MO2L_MOUSE   Q9dbl6 mus musculu
     3    1381   81.0     341   1  MO25_HUMAN   Q9y376 homo sapien
     4    1376   80.8     341   1  MO25_MOUSE   Q06138 mus musculu
     5    1111   65.2     339   1  MO25_DROME   P91891 drosophila
     6  1006.5   59.1     338   1  MO2M_CAEEL   O18211 caenorhabdi
     7   834.5   49.0     329   1  YFV6_SCHPO   Q9p7q8 schizosacch
     8     776   45.5     321   1  DE76_CHLPR   Q9xfy6 chlorella p
     9     728   42.7     343   1  MO2N_ARATH   Q9fgk3 arabidopsis
    10   716.5   42.0     343   1  MO2M_ARATH   Q9m0m4 arabidopsis
    11     666   39.1     384   1  HYMA_EMENI   O60032 emericella
    12     632   37.1     348   1  MO2L_ARATH   Q9zq77 arabidopsis
    13     485   28.5     399   1  HYM1_YEAST   P32464 saccharomyc
    14   143.5    8.4     339   1  MO2L_CAEEL   Q9tzm2 caenorhabdi
    15   128.5    7.5    3911   1  AKA9_HUMAN   Q99996 h a-kinase
    16   125.5    7.4     298   1  Y295_RICPR   Q9zdn2 rickettsia
    17   118.5    7.0     959   1  DPO5_SCHPO   O60094 schizosacch
    18   116.5    6.8     724   1  HMMR_HUMAN   O75330 homo sapien
    19     115    6.7     474   1  GSHB_MOUSE   P51855 mus musculu
    20   112.5    6.6    1411   1  YM42_YEAST   Q03214 saccharomyc
    21   109.5    6.4     978   1  RA50_AQUAE   O67124 aquifex aeo
    22     109    6.4     695   1  YCX7_CHLVU   O20159 chlorella v
    23     109    6.4    1401   1  LATA_LATMA   P23631 latrodectus
    24   108.5    6.4     586   1  2A5D_RABIT   Q28653 o serine/th
    25   108.5    6.4     602   1  2A5D_HUMAN   Q14738 h serine/th
    26   108.5    6.4    1075   1  YI24_METJA   057588 methanococc
    27     108    6.3     568   1  DNAB_PORPU   P51333 porphyra pu
    28   107.5    6.3     483   1  ACPA_BACAN   Q44643 bacillus an
    29   107.5    6.3    1042   1  TIRH_METJA   060295 methanococc
    30   107.5    6.3    1726   1  MSP1_PLAFC   P04934 plasmodium
    31   107.5    6.3    1726   1  MSP1_PLAFP   P50495 plasmodium
    32     107    6.3    1727   1  ALM1_SCHPO   Q9utk5 schizosacch
    33     106    6.2     474   1  GSHB_HUMAN   P48637 homo sapien
    34   105.5    6.2     793   1  REGA_DICDI   Q23917 dictyosteli
    35   105.5    6.2     847   1  RSG2_RAT     Q63713 rattus norv
    36   104.5    6.1    1701   1  MSP1_PLAFF   P13819 plasmodium
    37   104.5    6.1    1701   1  MSP1_PLAFM   P08569 plasmodium
    38     104    6.1     859   1  MUTS_AQUAE   O66652 aquifex aeo
    39     104    6.1    1290   1  RA50_SCHPO   Q9utj8 schizosacch
    40     104    6.1    1682   1  MSP1_PLAF3   P19598 plasmodium
    41   103.5    6.1     641   1  PRIM_UREPA   Q9ppz6 ureaplasma
    42     103    6.0    2663   1  CENE_HUMAN   Q02224 homo sapien
    43   102.5    6.0     502   1  URIC_BACSB   Q45697 bacillus sp
    44   102.5    6.0     975   1  KINH_DROME   P17210 drosophila
    45   102.5    6.0    1202   1  RPM2_YEAST   Q02773 saccharomyc


ALIGNMENTS 



RESULT 1 
M02L_HUMAN 

ID M02L_HUMAN STANDARD; PRT; 334 AA. 

AC Q9H9S4; Q9BZ33 ; 

DT 16~OCT-2001 (Rel . 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE M025-like protein. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE OF 4-334 FROM N.A. 

RA Isogai T., Ota T., Hayashi K., Sugiyama T., Otsuki T., Suzuki Y.,
RA Nishikawa T., Nagai K., Sugano S., Shiratori A., Sudo H.,
RA Wagatsuma M., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M.,
RA Takahashi M., Chiba Y., Ishida S., Murakawa K., Ono Y., Takiguchi S.,
RA Watanabe S., Kimura K., Murakami K., Ishii S., Kawai Y., Saito K.,
RA Yamamoto J., Wakamatsu A., Nakamura Y., Nagahari K., Masuho Y.,
RA Ninomiya K., Iwayanagi T.;

RT "NEDO human cDNA sequencing project."; 

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases. 

RN [2] 



RP SEQUENCE OF 276-334 FROM N.A. 

RA Pearce A.;

RL Submitted (JAN-2001) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AK022639; BAB14147.1; ALT_INIT. 

DR EMBL; AL138875; CAC28084.1; 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1.

SQ SEQUENCE 334 AA; 38728 MW; 97702273D8548432 CRC64; 



Query Match 98.9%; Score 1685; DB 1; Length 334;

Best Local Similarity 99.7%; Pred. No. 1.3e-100; 

Matches 333; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy          4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 60

Qy         64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 120

Qy        124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
              |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db        121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLVKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy        184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 240

Qy        244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy        304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
              ||||||||||||||||||||||||||||||||||
Db        301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 334





RESULT 2 
MO2L_MOUSE
ID MO2L_MOUSE STANDARD; PRT; 334 AA.
AC Q9DB16; Q8BG52; Q91WB8; Q91YL0;
DT 16-OCT-2001 (Rel. 40, Created)
DT 15-SEP-2003 (Rel. 42, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE MO25-like protein.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).

RC STRAIN=C57BL/6J; 

RC TISSUE=Cerebellum, Eye, Pituitary, and Testis; 

RX MEDLINE=22354683; PubMed=12466851;
RA Okazaki Y., Furuno M., Kasukawa T., Adachi J., Bono H., Kondo S.,
RA Nikaido I., Osato N., Saito R., Suzuki H., Yamanaka I., Kiyosawa H.,
RA Yagi K., Tomaru Y., Hasegawa Y., Nogami A., Schonbach C., Gojobori T.,
RA Baldarelli R., Hill D.P., Bult C., Hume D.A., Quackenbush J.,
RA Schriml L.M., Kanapin A., Matsuda H., Batalov S., Beisel K.W.,
RA Blake J.A., Bradt D., Brusic V., Chothia C., Corbani L.E., Cousins S.,
RA Dalla E., Dragani T.A., Fletcher C.F., Forrest A., Frazer K.S.,
RA Gaasterland T., Gariboldi M., Gissi C., Godzik A., Gough J.,
RA Grimmond S., Gustincich S., Hirokawa N., Jackson I.J., Jarvis E.D.,
RA Kanai A., Kawaji H., Kawasawa Y., Kedzierski R.M., King B.L.,
RA Konagaya A., Kurochkin I.V., Lee Y., Lenhard B., Lyons P.A.,
RA Maglott D.R., Maltais L., Marchionni L., McKenzie L., Miki H.,
RA Nagashima T., Numata K., Okido T., Pavan W.J., Pertea G., Pesole G.,
RA Petrovsky N., Pillai R., Pontius J.U., Qi D., Ramachandran S.,
RA Ravasi T., Reed J.C., Reed D.J., Reid J., Ring B.Z., Ringwald M.,
RA Sandelin A., Schneider C., Semple C.A., Setou M., Shimada K.,
RA Sultana R., Takenaka Y., Taylor M.S., Teasdale R.D., Tomita M.,
RA Verardo R., Wagner L., Wahlestedt C., Wang Y., Watanabe Y., Wells C.,
RA Wilming L.G., Wynshaw-Boris A., Yanagisawa M., Yang I., Yang L.,
RA Yuan Z., Zavolan M., Zhu Y., Zimmer A., Carninci P., Hayatsu N.,
RA Hirozane-Kishikawa T., Konno H., Nakamura M., Sakazume N., Sato K.,
RA Shiraki T., Waki K., Kawai J., Aizawa K., Arakawa T., Fukuda S.,
RA Hara A., Hashizume W., Imotani K., Ishii Y., Itoh M., Kagawa I.,
RA Miyazaki A., Sakai K., Sasaki D., Shibata K., Shinagawa A.,
RA Yasunishi A., Yoshino M., Waterston R., Lander E.S., Rogers J.,
RA Birney E., Hayashizaki Y.;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM 1).
RC STRAIN=FVB/N; TISSUE=Mammary gland, and Salivary gland;
RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- ALTERNATIVE PRODUCTS:
CC Event=Alternative splicing; Named isoforms=2;
CC Name=1;
CC IsoId=Q9DB16-1; Sequence=Displayed;
CC Name=2;
CC IsoId=Q9DB16-2; Sequence=VSP_007417, VSP_007418;
CC Note=No experimental confirmation available;
CC -!- SIMILARITY: Belongs to the Mo25 family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AK005323; BAB23953.2; ALT_INIT.
DR EMBL; AK030474; BAC26978.1; ALT_INIT.

DR EMBL; AK053642; BAC35457.1; ALT_INIT. 

DR EMBL; AK076758; BAC36470.1; ALT_INIT. 

DR EMBL; AK076867; BAC36513.1; -. 

DR EMBL; BC016128; AAH16128.1; 

DR EMBL; BC016546; AAH16546.1; -. 

DR MGD; MGI:1916258; 1500031K13Rik.

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

KW Alternative splicing. 

FT VARSPLIC 276 293 VFVASPHKTQPIVEILLK -> NSVFITNRIHGLKRWLSS
FT (in isoform 2).
FT /FTId=VSP_007417.
FT VARSPLIC 294 334 Missing (in isoform 2).
FT /FTId=VSP_007418.
FT CONFLICT 42 42 S -> P (IN REF. 1; BAB23953).
FT CONFLICT 229 229 L -> R (IN REF. 2; AAH16546).
SQ SEQUENCE 334 AA; 38718 MW; 822F04A87FB4EB6F CRC64;

Query Match 97.9%; Score 1669; DB 1; Length 334; 

Best Local Similarity 98.5%; Pred. No. 1.4e-99; 

Matches 329; Conservative 2; Mismatches 3; Indels 0; Gaps 0; 

Qy          4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db          1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy         64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
              ||||||||||||||||||||||||||||||||||||||||||||||||| |||||||:||
Db         61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRCPTVEYISSHP 120

Qy        124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy        184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFT 240

Qy        244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy        304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
              ||||||||||||||||||||||||||||||| ||
Db        301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 334

RESULT 3 
MO25_HUMAN
ID MO25_HUMAN STANDARD; PRT; 341 AA.
AC Q9Y376;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 15-SEP-2003 (Rel. 42, Last annotation update)
DE MO25 protein (CGI-66).
GN MO25.
OS Homo sapiens (Human).



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=20272150; PubMed=10810093;
RA Lai C.-H., Chou C.-Y., Ch'ang L.-Y., Liu C.-S., Lin W.-C.;
RT "Identification of novel human genes evolutionarily conserved in
RT Caenorhabditis elegans by comparative proteomics.";

RL Genome Res. 10:703-713(2000). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Hypothalamus;
RA Jin W., Shi J., Ren S., Gu J., Fu S., Huang Q., Dong H., Yu Y., Fu G.,
RA Wang Y., Chen Z., Han Z.;
RT "A novel gene expressed in the human hypothalamus.";

RL Submitted (DEC-1998) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Duodenum; 

RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length 

RT human and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AF151824; AAD34061.1; 

DR EMBL; AF113536; AAF14873.1; -. 

DR EMBL; BC020570; AAH20570.1; -. 

DR InterPro; IPR004892; Mo25.

DR Pfam; PF03204; Mo25; 1. 

SQ SEQUENCE 341 AA; 39869 MW; EC710A528B6F9811 CRC64;

Query Match 81.0%; Score 1381; DB 1; Length 341;

Best Local Similarity 81.0%; Pred. No. 3.1e-81; 

Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2; 

Qy          4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
              ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db          1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy         60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
              || ||||||||||||:|||| ||:|||||||||||||| ||||||||||||||:||||||
Db         61 EPQTEAVAQLAQELYNSGLLSTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy        120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
                  :||||||||||:|:||| |||||||||||||||||||:| || |||:|||:|||||
Db        121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180

Qy        180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
              ||||||||||||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db        181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240

Qy        240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
              ||| ||||||||||||||||||||||| |||||||||||||||:|:|||||::||||||
Db        241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQA 300

Qy        300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
              ||||||| || :||:|||| ||| ||:|||||||: |
Db        301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRPA 337

RESULT 4 
MO25_MOUSE
ID MO25_MOUSE STANDARD; PRT; 341 AA.
AC Q06138;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-FEB-1994 (Rel. 28, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE MO25 protein.
GN MO25 OR CAB39.
OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=93119656; PubMed=8418809;
RA Miyamoto H., Matsushiro A., Nozaki M.;
RT "Molecular cloning of a novel mRNA sequence expressed in cleavage
RT stage mouse embryos.";

RL Mol. Reprod. Dev. 34:1-7(1993). 

CC -!- FUNCTION: ONE OF THE FIRST GENES TO BE TRANSCRIBED DURING MOUSE 

CC DEVELOPMENT, MAY PLAY SOME GENERAL FUNCTION. 

CC -!- SUBCELLULAR LOCATION: Cytoplasmic (Potential). 

CC -!- DEVELOPMENTAL STAGE: TRANSCRIBED DURING EARLY MOUSE DEVELOPMENT. 

CC DETECTED AT ALL DEVELOPMENTAL STAGES FROM THE EGG THROUGH THE 

CC BLASTOCYST, MOST ABUNDANT AT THE 2-CELL STAGE.

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; S51858; AAB24801.1; 

DR MGD; MGI:107438; Cab39.

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

SQ SEQUENCE 341 AA; 39842 MW; E7F668529D6FE811 CRC64;



Query Match 80.8%; Score 1376; DB 1; Length 341; 

Best Local Similarity 80.7%; Pred. No. 6.5e-81; 

Matches 272; Conservative 32; Mismatches 29; Indels 4; Gaps 2; 

Qy          4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59
              ||  | ||||:||:||| ||:::|:||||   ||| :||:|||||:| |||||| |||||
Db          1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60

Qy         60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119
              || ||||||||||||:|||| ||:|||||||||||||| ||||||||||||||:||||||
Db         61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120

Qy        120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179
                  :||||||||||:|:||| |||||||||||||||||||:| || |||:|||:|||||
Db        121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180

Qy        180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239
              ||||||||||||||||:| |:||||:||  | :||||| |||||||||||||||||:|||
Db        181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240

Qy        240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299
              ||| ||||||||||||||||||||||| |||||||||||||||:|:|||||::||||||
Db        241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300

Qy        300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336
              ||||||| || :||:|||| ||| ||:||||:||: |
Db        301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRNLKRAA 337



RESULT 5 
MO25_DROME

ID MO25_DROME STANDARD; PRT; 339 AA.

AC P91891; Q9VV85;

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE MO25 protein (dMo25).
GN MO25 OR CG4083.
OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=96268479; PubMed=8672247;
RA Nozaki M., Onishi Y., Togashi S., Miyamoto H.;

RT "Molecular characterization of the Drosophila Mo25 gene, which is 

RT conserved among Drosophila, mouse, and yeast."; 

RL DNA Cell Biol. 15:505-509(1996). 

RN [2]

RP SEQUENCE FROM N.A. 

RC STRAIN=Berkeley; 

RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB000402; BAA19098.1; -. 

DR EMBL; AE003526; AAF49432.1; 

DR FlyBase; FBgn0017572; Mo25. 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1.

FT CONFLICT 51 51 Y -> H (IN REF. 1). 

FT CONFLICT 102 102 V -> L (IN REF. 1). 

SQ SEQUENCE 339 AA; 39385 MW; 5790BD91754C1C74 CRC64;

Query Match 65.2%; Score 1111; DB 1; Length 339; 

Best Local Similarity 65.0%; Pred. No. 4.9e-64; 

Matches 217; Conservative 59; Mismatches 54; Indels 4; Gaps 3; 
Qy 4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63 

MM M hi MM M: : M hi Ml IMIIM -I M h- III 

Db 1 MPLFGKSQKSPVELVKSLKEAINALEAGDRKVEKAQEDVSKNLVSIKNMLYGSSDAEPPA 60 

Qy 64 E-AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

MlhlMhl Ih II M lllilll I lllhlllllllllllllll 

Db 61 DYVVAQLSQELYNSNLLLLLIQNLHRIDFEGKKHVALIFNNVLRRQIGTRSPTVEYICTK 120

Qy 123 PHILFMLLKGYE--APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

I Ml h III hill I lllll hi Mlhl hM Ihllhllllll 

Db 121 PEILFTLMAGYEDAHPEIALNSGTMLRECARYEALAKIMLHSDEFFKFFRYVEVSTFDIA 180

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIF-EDYEKLLQSENYVTKRQSLKLLGELILDR 239 

Mlhllhlllllhl hlh III I : hMI MMIhlMIMMIhlM 

Db 181 SDAFSTFKELLTRHKLLCAEFLDANYDKFFSQHYQRLLNSENYVTRRQSLKLLGELLLDR 240

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

III UhllhllllllllhhHI llllllllllllllhl:! :||::|lhll 



Db 241 HNFTVMTRYISEPENLKLMMNMLKEKSRNIQFEAFHVFKVFVANPNKPKPILDILLRNQT 300 



Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

lh:|h:| :hUIII III lllllh:|| 
Db 301 KLVDFLTNFHTDRSEDEQFNDEKAYLIKQIKELK 334 

RESULT 6 

MO2M_CAEEL

ID MO2M_CAEEL STANDARD; PRT; 338 AA.

AC O18211;

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 16-OCT-2001 (Rel. 40, Last annotation update)
DE Hypothetical MO25-like protein Y53C12A.4 in chromosome II.

GN Y53C12A.4. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Bristol N2;
RA Kershaw J., Lennard N.;

RL Submitted (SEP-1997) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Z99277; CAB16486.1; -. 

DR PIR; T27129; T27129.

DR WormPep; Y53C12A.4; CE14890. 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 338 AA; 39431 MW; 1D0C34A35D9116F5 CRC64;

Query Match 59.1%; Score 1006.5; DB 1; Length 338; 

Best Local Similarity 57.2%; Pred. No. 2.2e-57; 

Matches 191; Conservative 60; Mismatches 78; Indels 5; Gaps 1; 

Qy 5 PLFSKSHKNPAEIVKILKDNLAILEK-----QDKKTDKASEEVSKSLQAMKEILCGTNEK 59

III h I l|::|| hi I :::: -I :|| || :| | | : |:: 

Db 4 PLFGKADKTPADVVKNLRDALLVIDRHGTNTSERKVEKAIEETAKMLALAKTFIYGSDAN 63

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

II I lllllhl:: :| II I :|| Mil : I I I : I I I I I I || || I I I I : 
Db 64 EPNNEQVTQLAQEVYNANVLPMLIKHLHKFEFECKKDVASVFNNLLRRQIGTRSPTVEYL 123 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

= 1 I II II III I III II MM :|ll Ihhhl h II :|: III 



Db 124 AARPEILITLLLGYEQPDIALTCGSMLREAVRHEHLARIVLYSEYFQRFFVFVQSDVFDI 183



Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

hllhllllhhil ^ hU: III I I I lllllhlllllMllhlll 

Db 184 ATDAFSTFKDLMTKHKNMCAEYLDNNYDRFFGQYSALTNSENYVTRRQSLKLLGELLLDR 243 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

Mh I llh lllll :| lllll llhllllllhllhhl :|| :|| :h 
Db 244 HNFSTMNKYITSPENLKTVMELLRDKRRNIQYEAFHVFKIFVANPNKPRPITDILTRNRD 303 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333 

lhllh:| :|hllll III lllllh:h 
Db 304 KLVEFLTAFHNDRTNDEQFNDEKAYLIKQIQELR 337 



RESULT 7 

YFV6_SCHPO
ID YFV6_SCHPO STANDARD; PRT; 329 AA.
AC Q9P7Q8;
DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update)
DT 28-FEB-2003 (Rel. 41, Last annotation update)
DE Hypothetical protein C1834.06c in chromosome I.
GN SPAC1834.06C.
OS Schizosaccharomyces pombe (Fission yeast).

OC Eukaryota; Fungi; Ascomycota; Schizosaccharomycetes; 

OC Schizosaccharomycetales; Schizosaccharomycetaceae; 

OC Schizosaccharomyces.

OX NCBI_TaxID=4896; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=972; 

RX MEDLINE=21848401; PubMed=11859360;
RA Wood V., Gwilliam R., Rajandream M.A., Lyne M., Lyne R., Stewart A.,
RA Sgouros J., Peat N., Hayles J., Baker S., Basham D., Bowman S.,
RA Brooks K., Brown D., Brown S., Chillingworth T., Churcher C.M.,
RA Collins M., Connor R., Cronin A., Davis P., Feltwell T., Fraser A.,
RA Gentles S., Goble A., Hamlin N., Harris D., Hidalgo J., Hodgson G.,
RA Holroyd S., Hornsby T., Howarth S., Huckle E.J., Hunt S., Jagels K.,
RA James K., Jones L., Jones M., Leather S., McDonald S., McLean J.,
RA Mooney P., Moule S., Mungall K., Murphy L., Niblett D., Odell C.,
RA Oliver K., O'Neil S., Pearson D., Quail M.A., Rabbinowitsch E.,
RA Rutherford K., Rutter S., Saunders D., Seeger K., Sharp S.,
RA Skelton J., Simmonds M., Squares R., Squares S., Stevens K.,
RA Taylor K., Taylor R.G., Tivey A., Walsh S.V., Warren T., Whitehead S.,
RA Woodward J., Volckaert G., Aert R., Robben J., Grymonprez B.,
RA Weltjens I., Vanstreels E., Rieger M., Schaefer M., Mueller-Auer S.,
RA Gabel C., Fuchs M., Fritzc C., Holzer E., Moestl D., Hilbert H.,
RA Borzym K., Langer I., Beck A., Lehrach H., Reinhardt R., Pohl T.M.,
RA Eger P., Zimmermann W., Wedler H., Wambutt R., Purnelle B.,
RA Goffeau A., Cadieu E., Dreano S., Gloux S., Lelaure V., Mottier S.,
RA Galibert F., Aves S.J., Xiang Z., Hunt C., Moore K., Hurst S.M.,
RA Lucas M., Rochet M., Gaillardin C., Tallada V.A., Garzon A., Thode G.,
RA Daga R.R., Cruzado L., Jimenez J., Sanchez M., del Rey F., Benito J.,
RA Dominguez A., Revuelta J.L., Moreno S., Armstrong J., Forsburg S.L.,
RA Cerrutti L., Lowe T., McCombie W.R., Paulsen I., Potashkin J.,
RA Shpakovski G.V., Ussery D., Barrell B.G., Nurse P.;
RT "The genome sequence of Schizosaccharomyces pombe.";
RL Nature 415:871-880(2002).

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AL157734; CAB75774.1; -. 

DR PIR; T50117; T50117. 

DR GeneDB_SPombe; SPAC1834.06c;

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 329 AA; 38521 MW; 073DD0607A64C952 CRC64; 



Query Match 49.0%; Score 834.5; DB 1; Length 329; 

Best Local Similarity 51.5%; Pred. No. 1.9e-46; 

Matches 169; Conservative 63; Mismatches 93; Indels 3; Gaps 2; 



Qy          6 LFSKSHKNPAEIVKILKDNLAILE-KQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTE 64
              Ihl h ::|: 1 III II III h Mill II III! 1 II :
Db          4 LFNKRPKSTQDVVRCLCDNLPKLEINNDKK--KSFEEVSKCLQNLRVSLCGTAEVEPDAD 61

Qy         65 AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPH 124
              h h ::| II hi ::|| III Ih MM : M llhh III
Db         62 LVSDLSFQIYQSNLPFLLVRYLPKLEFESKKDTGLIFSALLRRHVASRYPTVDYMLAHPQ 121

Qy        125 ILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAF 184
              1 M: 1 :M 1 Mill Ml 1 ::M 1 1 II IMIMIIII
Db        122 IFPVLVSYYRYQEVAFTAGSILRECSRHEALNEVLLNSRDFWTFFSLIQASSFDMASDAF 181

Qy        185 ATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAI 244
              MM M II Ihh :M hi IIMIIIIIIIIIIIIIIhMM
Db        182 STFKSILLNHKSQVAEFISYHFDEFFKQYTVLLKSENYVTKRQSLKLLGEILLNRANRSV 241

Qy        245 MTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEF 304
              IIMII lllllll llllll IIIMIIilllMIIM h: :MII M: Ml ^
Db        242 MTRYISSAENLKLMMILLRDKSKNIQFEAFHVFKLFVANPEKSEEVIEILRRNKSKLISY 301

Qy        305 LSSFQKERTDDEQFADEKNYLIKQIRDL 332
              MM M Mill Ih :MIII 1
Db        302 LSAFHTDRKNDEQFNDERAFVIKQIERL 329





RESULT 8 
DE76_CHLPR 

ID DE76_CHLPR STANDARD; PRT; 321 AA. 

AC Q9XFY6;

DT 16-OCT-2001 (Rel. 40, Created)
DT 16-OCT-2001 (Rel. 40, Last sequence update) 
DT 28-FEB-2003 (Rel. 41, Last annotation update) 
DE Degreening related gene dee76 protein. 



GN DEE76. 

OS Chlorella protothecoides.
OC Eukaryota; Viridiplantae; Chlorophyta; Trebouxiophyceae; Chlorellales;
OC Chlorellaceae; Auxenochlorella.
OX NCBI_TaxID=3075;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=ACC25;

RX MEDLINE=20256472; PubMed=10798614;

RA Hortensteiner S., Chinner J., Matile P., Thomas H., Donnison I.S.; 

RT "Chlorophyll breakdown in Chlorella protothecoides: characterization 

RT of degreening and cloning of degreening-related genes."; 

RL Plant Mol. Biol. 42:439-450(2000). 

CC -!- SIMILARITY: Belongs to the Mo25 family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AJ238632; CAB42595.1; 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1.

SQ SEQUENCE 321 AA; 37262 MW; 918FD02964B09071 CRC64; 

Query Match 45.5%; Score 776; DB 1; Length 321; 

Best Local Similarity 52.0%; Pred. No. 9.8e-43; 

Matches 156; Conservative 56; Mismatches 84; Indels 4; Gaps 3; 

Qy 32 DKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDF 91 

Db 19 ESKQDRWEDISKAIMSIKEAIFGEDEQSSSKEHAQGIASEACRVGLVSDLVTYLTVLDF 78 

Qy 92 EGKKDVTQIFNNILRRQI--GTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLREC 149 

I HII III hi : III :|: Ml ■ \ I III hill II I IN 
Db 79 ETRKDWQIFCAIIRITLEDGGR-PGRDYVLAHPDVLSTLFYGYEDPEIALNCGQMFREC 137 

Qy 150 IRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTI 209 



• I I I -11 -I I I • I • • • • I • • I I M M M I M Ml M • • M • 

Db 138 IRHEDIAKFVLECNLFEELFEKLNVQSFEVASDAFATFKDLLTRHKQLVAAFLQENYEDF 197 

Qy 210 FEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNI 269 

I HII hlllhlllllllllhlll I II :hl II lllllhl I :| 
Db 198 FSQLDKLLTSDNYVTRRQSLKLLGELLLDRVNVKIMMQYVSDVNNLILMMNLLKDSSRSI 257 

Qy 270 QFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 329 

llllllllllllhhlhh :|h h Ih :| I :| ||||| :|| :||:| 
Db 258 QFEAFHVFKVFVANPNKTKPVADILVNNKNKLLTYLEDFHNDR-DDEQFKEEKAVIIKEI 316 

RESULT 9 
MO2N_ARATH

ID MO2N_ARATH STANDARD; PRT; 343 AA.

AC Q9FGK3; 



DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Hypothetical MO25-like protein At5g47540.

GN AT5G47540 OR MNJ7.13.

OS Arabidopsis thaliana (Mouse-ear cress).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;

OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;

OC eurosids II; Brassicales; Brassicaceae; Arabidopsis. 

OX NCBI_TaxID=3702; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia; 

RA Kaneko T., Katoh T., Asamizu E., Sato S., Nakamura Y., Kotani H.,

RA Tabata S.;

RT "Structural analysis of Arabidopsis thaliana chromosome 5. XI.";

RL Submitted (APR-1999) to the EMBL/GenBank/DDBJ databases.

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB025628; BAB09080.1; -. 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 343 AA; 39457 MW; 46950D6A9A82FBB5 CRC64;



Query Match 42.7%; Score 728; DB 1; Length 343; 

Best Local Similarity 43.2%; Pred. No. 1.2e-39; 

Matches 147; Conservative 79; Mismatches 100; Indels 14; Gaps 4; 

Qy 6 LFSKSHKNPAEIVKILKDNLAILEK-------QDKKTDKASEEVSKSLQAMKEILCGTNE 58

II : lh:|: :| I - U I h hh-: || || I :| 

Db 4 LFKSKPRTPADLVRQTRDLLLFSDRSTSLPDLRDSKRDEKMAELSRNIRDMKSILYGNSE 63 

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEY 118 

II li III II : I II I I =11 Ih h hh U U 

Db 64 AEPVAEACAQLTQEFFKEDTLRLLITCLPKLNLETRKDATQWANLQRQQVNSRLIASDY 123 

Qy 119 ISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFD 178 

U: :h:hl HI I I I i I II h : I I : I I : : II l-j II 

Db 124 LEANIDLMDVLIEGFENTDMALHYGAMFRECIRHQIVAKYVLESDHVKKFFDYIQLPNFD 183 

Qy 179 IASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDY-EKLLQSENYVTKRQSLKLLGELIL 237

Ihll lllhllllM Ihll :| I III llhl ||:|:||::|IM:::| 
Db 184 IAADAAATFKELLTRHKSTVAEFLTKNEDWFFADYNSKLLESSNYITRRQAIKLLGDILL 243

Qy 238 DRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297 

II I hlllhl :||:::||ll|: I :|| llllllhl h :| M Ih I 
Db 244 DRSNSAVMTKYVSSRDNLRILMNLLRESSKSIQIEAFHVFKLFAANQNKPADIVNILVAN 303



Qy 298 QPKLIEFLSSFQKERTDDEQFADEKNYLIKQI-----RDL 332

: II: |: : :: :||:| :|: ::::| ||| 
Db 304 RSKLLRLLADLKPDK-EDERFEADKSQVLREIAALEPRDL 342 



RESULT 10 
MO2M_ARATH

ID MO2M_ARATH STANDARD; PRT; 343 AA.

AC Q9M0M4; O23570;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 15-SEP-2003 (Rel. 42, Last annotation update)

DE Hypothetical MO25-like protein At4g17270.

GN AT4G17270 OR DL4670W.

OS Arabidopsis thaliana (Mouse-ear cress).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;

OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;

OC eurosids II; Brassicales; Brassicaceae; Arabidopsis. 

OX NCBI_TaxID=3702; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia; 

RX MEDLINE=98121113; PubMed=9461215;

RA Bevan M., Bancroft I., Bent E., Love K., Goodman H.M., Dean C.,

RA Bergkamp R., Dirkse W., van Staveren M., Stiekema W., Drost L.,

RA Ridley P., Hudson S.-A., Patel K., Murphy G., Piffanelli P.,

RA Wedler H., Wedler E., Wambutt R., Weitzenegger T., Pohl T., Terryn N.,

RA Gielen J., Villarroel R., De Clercq R., van Montagu M., Lecharny A.,

RA Aubourg S., Gy I., Kreis M., Lao N., Kavanagh T., Hempel S.,

RA Kotter P., Entian K.-D., Rieger M., Schaefer M., Funk B.,

RA Mueller-Auer S., Silvey M., James R., Monfort A., Pons A.,

RA Puigdomenech P., Douka A., Voukelatou E., Milioni D., Hatzopoulos P.,

RA Piravandi E., Obermaier B., Hilbert H., Duesterhoeft A., Moores T.,

RA Jones J.D.G., Eneva T., Palme K., Benes V., Rechmann S., Ansorge W.,

RA Cooke R., Berger C., Delseny M., Voet M., Volckaert G., Mewes H.-W.,

RA Klosterman S., Schueller C., Chalwatzis N.;

RT "Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of 

RT Arabidopsis thaliana."; 

RL Nature 391:485-488(1998). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia;

RX MEDLINE=20083488; PubMed=10617198;

RA Mayer K.F.X., Schueller C., Wambutt R., Murphy G., Volckaert G.,

RA Pohl T., Duesterhoeft A., Stiekema W., Entian K.-D., Terryn N.,

RA Harris B., Ansorge W., Brandt P., Grivell L.A., Rieger M.,

RA Weichselgartner M., de Simone V., Obermaier B., Mache R., Mueller M.,

RA Kreis M., Delseny M., Puigdomenech P., Watson M., Schmidtheini T.,

RA Reichert B., Portetelle D., Perez-Alonso M., Boutry M., Bancroft I.,

RA Vos P., Hoheisel J., Zimmermann W., Wedler H., Ridley P.,

RA Langham S.-A., McCullagh B., Bilham L., Robben J.,

RA Van der Schueren J., Grymonprez B., Chuang Y.-J., Vandenbussche F.,

RA Braeken M., Weitjens I., Voet M., Bastiaens I., Aert R., Defoor E.,

RA Weitzenegger T., Bothe G., Ramsperger U., Hilbert H., Braun M.,

RA Holzer E., Brandt A., Peters S., van Staveren M., Dirkse W.,

RA Mooijman P., Klein Lankhorst R., Rose M., Hauf J., Koetter P.,

RA Bemeiser S., Hempel S., Feldpausch M., Lamberth S., Van den Daele H.,

RA De Keyser A., Buysshaert C., Gielen J., Villarroel R., De Clercq R.,

RA Van Montagu M., Rogers J., Cronin A., Quail M., Bray-Allen S.,

RA Clark L., Doggett J., Hall S., Kay M., Lennard N., McLay K., Mayes R.,

RA Pettett A., Rajandream M.A., Lyne M., Benes V., Rechmann S.,

RA Borkova D., Bloecker H., Scharfe M., Grimm M., Loehnert T.-H.,

RA Dose S., de Haan M., Maarse A.C., Schaefer M., Mueller-Auer S.,

RA Gabel C., Fuchs M., Fartmann B., Granderath K., Dauner D., Herzl A.,

RA Neumann S., Argiriou A., Vitale D., Liguori R., Piravandi E.,

RA Massenet O., Quigley F., Clabauld G., Muendlein A., Felber R.,

RA Schnabl S., Hiller R., Schmidt W., Lecharny A., Aubourg S.,

RA Chefdor F., Cooke R., Berger C., Monfort A., Casacuberta E.,

RA Gibbons T., Weber N., Vandenbol M., Bargues M., Terol J., Torres A.,

RA Perez-Perez A., Purnelle B., Bent E., Johnson S., Tacon D., Jesse T.,

RA Heijnen L., Schwarz S., Scholler P., Heber S., Francs P., Bielke C.,

RA Frishman D., Haase D., Lemcke K., Mewes H.-W., Stocker S.,

RA Zaccaria P., Bevan M., Wilson R.K., de la Bastide M., Habermann K.,

RA Parnell L., Dedhia N., Gnoj L., Schutz K., Huang E., Spiegel L.,

RA Sekhon M., Murray J., Sheet P., Cordes M., Abu-Threideh J.,

RA Stoneking T., Kalicki J., Graves T., Harmon G., Edwards J.,

RA Latreille P., Courtney L., Cloud J., Abbott A., Scott K., Johnson D.,

RA Minx P., Bentley D., Fulton B., Miller N., Greco T., Kemp K.,

RA Kramer J., Fulton L., Mardis E., Dante M., Pepin K., Hillier L.,

RA Nelson J., Spieth J., Ryan E., Andrews S., Geisel C., Layman D.,

RA Du H., Ali J., Berghoff A., Jones K., Drone K., Cotton M., Joshu C.,

RA Antonoiu B., Zidanic , Strong C., Sun H., Lamar B., Yordan C.,

RA Ma P., Zhong J., Preston R., Vil D., Shekher M., Matero A., Shah R.,

RA Swaby I.K., O'Shaughnessy A., Rodriguez M., Hoffman J., Till S.,

RA Granat S., Shohdy N., Hasegawa A., Hameed A., Lodhi M., Johnson A.,

RA Chen E., Marra M., Martienssen R., McCombie W.R.;

RT "Sequence and analysis of chromosome 4 of the plant Arabidopsis 

RT thaliana."; 

RL Nature 402:769-777(1999). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia; 

RA Shinozaki K. , Davis R.W., Ecker J.R., Theologis A. ; 

RT "RIKEN Arabidopsis full length cDNA clones (RAFLs) sequenced by the 

RT SSP consortium (Salk/Stanford/PGEC).";

RL Submitted (MAY-2001) to the EMBL/GenBank/DDBJ databases. 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC -!- CAUTION: Ref. 1 sequence differs from that shown due to erroneous
CC gene model prediction. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Z97343; CAB10508.1; ALT_SEQ.

DR EMBL; AL161546; CAB78730.1; 

DR EMBL; AF380659; AAK55740.1; 

DR InterPro; IPR004892; Mo25.

DR Pfam; PF03204; Mo25; 1. 

KW Hypothetical protein. 



SQ SEQUENCE 343 AA; 39650 MW; D340B49A4924B7D1 CRC64;

Query Match 42.0%; Score 716.5; DB 1; Length 343; 

Best Local Similarity 42.9%; Pred. No. 6.5e-39; 

Matches 144; Conservative 78; Mismatches 105; Indels 9; Gaps 3;

Qy 6 LFSKSHKNPAEIVKILKDNLAILEK-------QDKKTDKASEEVSKSLQAMKEILCGTNE 58

II : Ihlh :| I - I - hllh: :| II I :| 

Db 4 LFKSKPRTPADIVRQTRDLLLYADRSNSFPDLRESKREEKMVELSKSIRDLKLILYGNSE 63 

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEY 118 

II II III II : : I |: I I HI Ih h hh H H 

Db 64 AEPVAEACAQLTQEFFKADTLRRLLTSLPNLNLEARKDATQWANLQRQQVNSRLIAADY 123 

Qy 119 ISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFD 178 

: - h hi HI I I I I I I I I I I : I I : II hH II 

Db 124 LESNIDLMDFLVDGFENTDMALHYGTMFRECIRHQIVAKYVLDSEHVKKFFYYIQLPNFD 183 

Qy 179 IASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDY-EKLLQSENYVTKRQSLKLLGELIL 237

Ihll lllhllllll Ihll =11 III llhl |h|:|h:|llh::| 
Db 184 IAADAAATFKELLTRHKSTVAEFLIKNEDWFFADYNSKLLESTNYITRRQAIKLLGDILL 243

Qy 238 DRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297 

II Ihll Ih I :|h:Hlllh I II IIIMIhllh :| i Ih' I 
Db 244 DRSNSAVMTKYVSSMDNLRILMNLLRESSKTIQIEAFHVFKLFVANQNKPSDIANILVAN 303 

Qy 298 QPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK 333

: Ih h : :|hl H --I HI 

Db 304 RNKLLRLLADIKPDK-EDERFDADKAQWREIANLK 338 



RESULT 11 

HYMA_EMENI 

ID HYMA_EMENI STANDARD; PRT; 384 AA. 

AC O60032;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Conidiophore development protein hymA. 

GN HYMA. 

OS Emericella nidulans (Aspergillus nidulans).

OC Eukaryota; Fungi; Ascomycota; Pezizomycotina; Eurotiomycetes;

OC Eurotiales; Trichocomaceae; Emericella.

OX NCBI_TaxID=162425;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=99126010; PubMed=9928930;

RA Karos M., Fischer R.;

RT "Molecular characterization of HymA, an evolutionarily highly

RT conserved and highly expressed protein of Aspergillus nidulans.";

RL Mol. Gen. Genet. 260:510-521(1999). 

CC -!- FUNCTION: Required for conidiophore development. 

CC -!- SUBCELLULAR LOCATION: Cytoplasmic. 

CC -!- SIMILARITY: Belongs to the Mo25 family. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 



CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AJ001157; CAA04556.1; -. 

DR InterPro; IPR004892; Mo25.

DR Pfam; PF03204; Mo25; 1.

SQ SEQUENCE 384 AA; 44392 MW; 2E203D0D110C5FD6 CRC64;



Query Match 39.1%; Score 666; DB 1; Length 384;

Best Local Similarity 39.8%; Pred. No. 1.2e-35; 

Matches 140; Conservative 68; Mismatches 114; Indels 30; Gaps 5; 

Qy 12 KNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQ 71

: h-h :|| 11- I :h:| I I I - I I I I I : I I I 

Db 11 RQPSDWRSIKDLLLRL-REPSTASKVEDELAKQLSQMKLMVQGTQELEASTDQVHALVQ 69 

Qy 72 ELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILR----RQIGTRSPTVEYI-SAHPHIL 126

111 I I I : I I I h : I I I I : I I I I : 

Db 70 AMLHEDLLYELAVALHNLPFEARKDTQTIFSHILRFKPPHGNSPDPPVISYIVHNRPEII 129 



Qy 127 FMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQ---------------------- 164

I UN I h II :|ll : :| llh 

Db 130 IELCRGYEHSQSAMPCGTILREALKFDVIAAIILYDQSKEGEPAIRLTEVQPNVPQRGTG 189

Qy 165 -FRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEK-LLQSENY 222 

I II lh:::|ll Ih-lllll II :| hi I : hllhl 

Db 190 VFWRFFHWIDRGTFELSADAFTTFREILTRHKSLVTGYLATNFDYFFAQFNTFLVQSESY 249

Qy 223 VTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVA 282 

lllllhllllhHIl h::| :h Mill I MM MM l|||||||| 

Db 250 VTKRQSIKLLGEILLDRANYSVMMRYVESGENLKLCMKLLRDDRKMVQYEGFHVFKVFVA 309 

Qy 283 SPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK 334 

M h : Ih h M: II I -MIIMI llhM-ll I I 
Db 310 NPDKSVAVQRILINNRDRLLRFLPKFLEDRTDDDQFTDEKSFLVRQIELLPK 361 



RESULT 12 
MO2L_ARATH

ID MO2L_ARATH STANDARD; PRT; 348 AA.

AC Q9ZQ77; 

DT 16-OCT-2001 (Rel. 40, Created) 
DT 16-OCT-2001 (Rel. 40, Last sequence update) 
DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Hypothetical MO25-like protein At2g03410.

GN AT2G03410 OR T4M8.16.

OS Arabidopsis thaliana (Mouse-ear cress).

OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;

OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;

OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.

OX NCBI_TaxID=3702;
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=cv. Columbia; 



RX MEDLINE=20083487; PubMed=10617197;

RA Lin X., Kaul S., Rounsley S.D., Shea Benito M.-I., Town C.D.,

RA Fujii C.Y., Mason T.M., Bowman C.L., Barnstead M.E., Feldblyum T.V.,

RA Buell C.R., Ketchum K.A., Lee J.J., Ronning C.M., Koo H.L.,

RA Moffat K.S., Cronin L.A., Shen M., Pai G., Van Aken S., Umayam L.,

RA Tallon Gill J.E., Adams Carrera A.J., Creasy T.H.,

RA Goodman H.M., Somerville C.R., Copenhaver G.P., Preuss D.,

RA Nierman W.C., White O., Eisen J.A., Salzberg S.L., Fraser C.M.,

RA Venter J.C.;

RT "Sequence and analysis of chromosome 2 of the plant Arabidopsis 

RT thaliana."; 

RL Nature 402:761-768(1999). 

CC -!- SIMILARITY: Belongs to the Mo25 family.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AC006284; AAD17435.1; -. 

DR PIR; B84448; B84448.

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 348 AA; 40000 MW; AB1D92EA2E2B900E CRC64;



Query Match 37.1%; Score 632; DB 1; Length 348; 

Best Local Similarity 38.7%; Pred. No. 1.6e-33; 

Matches 133; Conservative 80; Mismatches 117; Indels 14; Gaps 5;

Qy 6 LFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASE-------EVSKSLQAMKEILCGTNE 58

II = I III: =1 :|: I --M = h :::: :||| | | 

Db 4 LFKNKSRLPGEIVRQTRDLIALAESEEEETDARNSKRLGICAELCRNIRDLKSILYGNGE 63

Qy 59 KEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEY 118 

II II I m ^ I II : :| I :|| III |: ::|: I || 
Db 64 AEPVPEACLLLTQEFFRADTLRPLIKSIPKLDLEARKDATQIVANLQKQQVEFRLVASEY 123 

Qy 119 ISAHPHILFMLLKGYEAP-QIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTF 177 

- - 1-1 ^ ::|l Ihlhlh :|| II I II Ihl I 

Db 124 LESNLDVIDSLVEGIDHDHELALHYTGMLKECVRHQWAKYILESKNLEKFFDYVQLPYF 183 

Qy 178 DIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYE-KLLQSENYVTKRQSLKLLGELI 236 

hhll hHIIIII :|h I :| ilh -A lllh MM::: 

Db 184 DVATDASKIFRELLTRHKSTVAEYLAKNYEWFFAEYNTKLLEKGSYFTKRQASKLLGDVL 243 

Qy 237 LDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLK 296 

= 11 I :| Ihl :||::|llll|: : III ilihlhilj: :| : II Ih 
Db 244 MDRSNSGVMVKYVSSLDNLRIMMNLLREPTKNIQLEAFHIFKLFVANENKPEDIVAILVA 303 

Qy 297 NQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLK----KTA 336

|: |:: : : |: :| | :| :: :| | ||| 
Db 304 NRTKILRLFADLKPEK-EDVGFETDKALVMNEIATLSLLDIKTA 346 



RESULT 13 
HYM1_YEAST 

ID HYM1_YEAST STANDARD; PRT; 399 AA.

AC P32464;

DT 01-OCT-1993 (Rel. 21, Created)

DT 01-OCT-1993 (Rel. 21, Last sequence update)

DT 16-OCT-2001 (Rel. 40, Last annotation update)

DE HYM1 protein.

GN HYM1 OR YKL189W.

OS Saccharomyces cerevisiae (Baker's yeast).

OC Eukaryota; Fungi; Ascomycota; Saccharomycotina; Saccharomycetes;

OC Saccharomycetales; Saccharomycetaceae; Saccharomyces.

OX NCBI_TaxID=4932; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=GRF88; 

RX MEDLINE=93348778; PubMed=8394042;

RA Cheret G., Mattheakis L.C., Sor F.;

RT "DNA sequence analysis of the YCN2 region of chromosome XI in

RT Saccharomyces cerevisiae.";

RL Yeast 9:661-667(1993). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=94205264; PubMed=8154185;

RA Wiemann S., Voss H., Schwager C., Rupp T., Stegemann J.,

RA Zimmermann J., Grothues D., Sensen C., Erfle H., Hewitt N.,

RA Banrevi A., Ansorge W.;

RT "Sequencing and analysis of 51.6 kilobases on the left arm of

RT chromosome XI from Saccharomyces cerevisiae reveals 23 open reading

RT frames including the FAS1 gene.";

RL Yeast 9:1343-1348(1993). 

RN [3] 

RP SEQUENCE FROM N.A. 

RA Maia e Silva A., Bossier P., Vilela C., Fernandes L., Soares H.,

RA Guerreiro P., Rodrigues-Pousada C.;

RL Submitted (MAR-1994) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP GENE NAME. 

RX MEDLINE=20157038; PubMed=10655212;

RA Borland S., Deegenaars M.L., Stillman D.J.;

RT "Roles for the Saccharomyces cerevisiae SDS3, CBK1 and HYM1 genes in

RT transcriptional repression by SIN3."; 

RL Genetics 154:573-586(2000). 

CC -!- SIMILARITY: Belongs to the Mo25 family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X69765; CAA49422.1; -. 

DR EMBL; X74151; CAA52249.1; -. 

DR EMBL; Z28189; CAA82032.1; -. 



DR PIR; S34681; S34681. 

DR SGD; S0001672; HYM1.

DR GO; GO:0005622; C:intracellular; IDA.

DR GO; GO:0016564; F:transcriptional repressor activity; IMP.

DR GO; GO:0007109; P:cytokinesis, completion of separation; IMP.

DR GO; GO:0008360; P:regulation of cell shape; IGI.

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1. 

SQ SEQUENCE 399 AA; 45853 MW; F48860754C892BA9 CRC64; 



Query Match 28.5%; Score 485; DB 1; Length 399;

Best Local Similarity 33.0%; Pred. No. 4.3e-24;

Matches 113; Conservative 75; Mismatches 138; Indels 16; Gaps 6;



Qy 7 FSKSHKNPAEIVKILKDNLAILEK----QDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62

: h I h: : I II I I II H I I : I : I 

Db 16 WKKNPKTPSDYARLIIEQLNKFSSPSLTQDNKR-KVQEECTKYLIGTKHFIVGDTDPHPT 74 

Qy 63 TEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAH 122 

||: :| :: : : |: ::|| ::: ||: | : ||:|: : 

Db 75 PEAIDELYTAMHRADVFYELLLHFVDLEFEARRECMLIFSICLGYSKDNKFVTVDYLVSQ 134 

Qy 123 PHILFMLLKGYE-------APQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELS 175

I : ::h I I I I h llhU I HI! I ll- H 

Db 135 PKTISLMLRTAEVALQQKGCQDIFLTVGNMIIECIKYEQLCRIILKDPQLWKFFEFAKLG 194 

Qy 176 TFDIASDAFATFKDLLTRHKVLVA-DFL--EQNYDTIFEDYEKLLQSENYVTKRQSLKLL 232 

Db 195 NFEISTESLQILSAAFTAHPKLVSKEFFSNEINIIRFIKCINKLMAHGSYVTKRQSTKLL 254 

Qy 233 GELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVE 292 

Ih I I hi Ih Mlllhl h III hi llhllll Ihi h:h : 
Db 255 ASLIVIRSNNALMNIYINSPENLKLIMTLMTDKSKNLQLEAFNVFKVMVANPRKSKPVFD 314 

Qy 293 ILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKK 334 



Db 315 ILVKNRDKLLTYFKTFGLD-SQDSTFLDEREFIVQEIDSLPR 355 



RESULT 14 
MO2L_CAEEL

ID MO2L_CAEEL STANDARD; PRT; 339 AA.

AC Q9TZM2;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 16-OCT-2001 (Rel. 40, Last annotation update) 

DE Hypothetical MO25-like protein T27C10.3 in chromosome I.

GN T27C10.3.

OS Caenorhabditis elegans.

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]

RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA Zhu H.J., Graves T., Hawkins M.;

RL Submitted (OCT-1998) to the EMBL/GenBank/DDBJ databases. 



CC -!- SIMILARITY: Belongs to the Mo25 family.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; AF098504; AAC67411.1; -. 

DR PIR; T33477; T33477. 

DR WormPep; T27C10.3; CE19605. 

DR InterPro; IPR004892; Mo25. 

DR Pfam; PF03204; Mo25; 1.

KW Hypothetical protein.

SQ SEQUENCE 339 AA; 40232 MW; E7DA45CA33F2947E CRC64;

Query Match 8.4%; Score 143.5; DB 1; Length 339; 

Best Local Similarity 19.3%; Pred. No. 0.02; 

Matches 38; Conservative 50; Mismatches 76; Indels 33; Gaps 4; 

Qy 159 ILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQ 218 

:: ||: | i : : : :|:: | : ||: 

Db 100 LMNTNKFRD---------FDVIQGTFDTLQIIFFTNHESANNFIKNNLPRFMQTLHKLIA 150

Qy 219 SENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFK 278 

|: : :| I I II : |: : ::::| :||:: :: :: | : : 

Db 151 CSNFFIQAKSFKFLNELFTAQTNYETRSLWMAEPAFIKLWLAIQSNKHAVRSRAVSILE 210 

Qy 279 VFVASPHKTQPIVEILLKNQPKLIEFL SSFQKERTDDEQFAD 320 

:\: :\ : : \ : :|: || | I Ul I hi 
Db 211 IFIRNPRNSPEVHEFIGRNRNVLIAFFFNSAPIHYYQGSPNEKE DAQYARMAYKLLN 267 

Qy 321 ---EKNYLIKQIRDLKK 334 

Db 268 WDMQRPFTQEQLQDFEE 284

RESULT 15 
AKA9_HUMAN 

ID AKA9_HUMAN STANDARD; PRT; 3911 AA.

AC Q99996; O14869; O43355; O94895; Q9UQH3; Q9UQQ4; Q9Y6B8; Q9Y6Y2;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-SEP-2003 (Rel. 42, Last annotation update) 

DE A-kinase anchor protein 9 (Protein kinase A anchoring protein 9) 

DE (PRKA9) (A-kinase anchor protein 450 kDa) (AKAP 450) (A-kinase anchor 

DE protein 350 kDa) (AKAP 350) (hgAKAP 350) (AKAP 120 like protein) 

DE (Hyperion protein) (Yotiao protein) (Centrosome- and Golgi-localized

DE PKN-associated protein) (CG-NAP).

GN AKAP9 OR AKAP450 OR AKAP350 OR KIAA0803.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1] 



RP SEQUENCE FROM N.A. (ISOFORM 4).

RC TISSUE=Brain;

RX MEDLINE=98151389; PubMed=9482789;

RA Lin J.W., Wyszynski M., Madhavan R., Sealock R., Kim J.U., Sheng M.;

RT "Yotiao, a novel protein of neuromuscular junction and brain that

RT interacts with specific splice variants of NMDA receptor subunit

RT NR1.";

RL J. Neurosci. 18:2017-2027(1998). 
RN [2] 

RP SEQUENCE FROM N.A. (ISOFORM 2), AND VARIANT GLN-1347 INS. 

RX MEDLINE=99219864; PubMed=10202149;

RA Witczak O., Skaalhegg B.S., Keryer G., Bornens M., Tasken K.,

RA Jahnsen T., Oerstavik S.;

RT "Cloning and characterization of a cDNA encoding an A-kinase anchoring

RT protein located in the centrosome, AKAP450.";

RL EMBO J. 18:1858-1868(1999). 
RN [3] 

RP SEQUENCE FROM N.A. (ISOFORM 3).

RC TISSUE=Brain;

RX MEDLINE=99287934; PubMed=10358086;

RA Takahashi M., Shibata H., Shimakawa M., Miyamoto M., Mukai H., Ono Y.;

RT "Characterization of a novel giant scaffolding protein, CG-NAP, that

RT anchors multiple signaling enzymes to centrosome and the Golgi

RT apparatus.";

RL J. Biol. Chem. 274:17267-17274(1999). 
RN [4] 

RP SEQUENCE FROM N.A. (ISOFORM 1).

RA Kemmner W.A., Deiss S., Schwarz U.;

RT "Cloning of Hyperion.";

RL Submitted (AUG-1998) to the EMBL/GenBank/DDBJ databases. 

RN [5] 

RP SEQUENCE OF 323-3911 FROM N.A. (ISOFORM 2).

RC TISSUE=Gastric parietal cell;

RX MEDLINE=99115654; PubMed=9915845;

RA Schmidt P.H., Dransfield D.T., Claudio J.O., Hawley R.G.,

RA Trotter K.W., Milgram S.L., Goldenring J.R.;

RT "AKAP350, a multiply spliced protein kinase A-anchoring protein

RT associated with centrosomes.";

RL J. Biol. Chem. 274:3055-3066(1999). 

RN [6] 

RP SEQUENCE OF 1802-3876 FROM N.A. (ISOFORM 5). 

RC TISSUE=Lymphoblast;

RA Hinds K., Sutterer C., Becker M., Hawkins M.;

RL Submitted (JAN-1998) to the EMBL/GenBank/DDBJ databases.

RN [7]

RP SEQUENCE OF 2157-3911 FROM N.A. (ISOFORM 6).

RC TISSUE=Lung;

RA Milgram S.L., Goldenring J.R., Schmidt P.H.;

RT "AKAP350: A multiply spliced family of proteins with centrosomal

RT association.";

RL Submitted (SEP-1998) to the EMBL/GenBank/DDBJ databases. 

RN [8] 

RP SEQUENCE OF 2212-3911 FROM N.A. (ISOFORM 2/3). 

RC TISSUE=Brain; 

RX MEDLINE=99087487; PubMed=9872452;

RA Nagase T., Ishikawa K.-I., Suyama M., Kikuno R., Miyajima N.,

RA Tanaka A., Kotani H., Nomura N., Ohara O.;



RT "Prediction of the coding sequences of unidentified human genes. XI. 

RT The complete sequences of 100 new cDNA clones from brain which code 

RT for large proteins in vitro."; 

RL DNA Res. 5:277-286(1998). 

RN [9] 

RP SEQUENCE OF 17-1800 FROM N.A.

RA Wu X., Graves T., Bradshaw H.;

RL Submitted (SEP-1998) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: BINDS TO TYPE II REGULATORY SUBUNITS OF PROTEIN KINASE 

CC A. SCAFFOLDING PROTEIN THAT ASSEMBLES SEVERAL PROTEIN KINASES AND 

CC PHOSPHATASES ON CENTROSOME AND GOLGI APPARATUS WHERE PHYSIOLOGICAL 

CC EVENTS CAN BE REGULATED BY PHOSPHORYLATION STATE OF PROTEIN 

CC SUBSTRATES. ISOFORM 4/YOTIAO IS ASSOCIATED WITH THE N-METHYL-D- 

CC ASPARTATE RECEPTOR AND IS SPECIFICALLY FOUND IN THE NEUROMUSCULAR 

CC JUNCTION (NMJ) AS WELL AS IN NEURONAL SYNAPSES EXPLAINING THAT ITS 

CC ROLE MAY BE TO ORGANIZE POSTSYNAPTIC SPECIALIZATIONS. 

CC -!- SUBUNIT: INTERACTS WITH THE REGULATORY REGION OF PROTEIN KINASE N 

CC (PKN), PROTEIN PHOSPHATASE 2A (PP2A), PROTEIN PHOSPHATASE 1 (PP1)

CC AND THE IMMATURE NON-PHOSPHORYLATED FORM OF PKC EPSILON. 

CC -!- SUBCELLULAR LOCATION: CENTROSOMAL IN MANY CELL TYPES AND 

CC CYTOPLASMIC IN PARIETAL CELLS. 

CC -!- ALTERNATIVE PRODUCTS: 

CC Event=Alternative splicing; Named isoforms=6;

CC Name=1;

CC IsoId=Q99996-1; Sequence=Displayed;

CC Name=2;

CC IsoId=Q99996-2; Sequence=VSP_004102, VSP_004107;

CC Name=3; Synonyms=CG-NAP;

CC IsoId=Q99996-3; Sequence=VSP_004102, VSP_004105, VSP_004107;

CC Name=4; Synonyms=Yotiao;

CC IsoId=Q99996-4; Sequence=VSP_004103, VSP_004104;

CC Name=5;

CC IsoId=Q99996-5; Sequence=VSP_004108;

CC Name=6; Synonyms=AKAP350;

CC IsoId=Q99996-6; Sequence=VSP_004106, VSP_004107, VSP_004109;

CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED. ISOFORM 4/YOTIAO IS HIGHLY 
CC EXPRESSED IN SKELETAL MUSCLE AND IN PANCREAS. 

CC -!- DOMAIN: RII BINDING SITE, PREDICTED TO FORM AN AMPHIPATHIC HELIX, 

CC COULD PARTICIPATE IN PROTEIN-PROTEIN INTERACTIONS WITH A

CC COMPLEMENTARY SURFACE ON THE R-SUBUNIT DIMER.

CC -!- CAUTION: REF. 6 SEQUENCE DIFFERS FROM THAT SHOWN DUE TO TWO

CC FRAMESHIFTS IN POSITIONS 3782 AND 3811.

CC -!- CAUTION: REF. 9 SEQUENCE DIFFERS FROM THAT SHOWN DUE TO FOUR 
CC FRAMESHIFTS IN POSITIONS 29, 1653, 1699 AND 1735. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AJ131693; CAB40713.1; -. 

DR EMBL; AB019691; BAA78718.1;

DR EMBL; AJ010770; CAA09361.1; -. 

DR EMBL; AF026245; AAB86384.1; 



DR EMBL; AF083037; AAD22767.1;
DR EMBL; AC004013; AAB96867.1; ALT_FRAME.
DR EMBL; AF091711; AAD39719.1;
DR EMBL; AB018346; BAA34523.1;
DR EMBL; AC000066; AAC60380.1; ALT_FRAME.
DR Genew; HGNC:379; AKAP9.
DR MIM; 604001; -.
DR GO; GO:0005813; C:centrosome; TAS.
DR GO; GO:0005856; C:cytoskeleton; TAS.
DR GO; GO:0004973; F:N-methyl-D-aspartate receptor-associated pr...; TAS.
DR GO; GO:0005515; F:protein binding activity; TAS.
DR GO; GO:0007165; P:signal transduction; TAS.
DR GO; GO:0006832; P:small molecule transport; TAS.
DR GO; GO:0007268; P:synaptic transmission; TAS.
KW ; Alternative splicing; Polymorphism.
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/FTId=VSP_004103 . 


FT   VARSPLIC   1643   3911       Missing (in isoform 4).
FT                                /FTId=VSP_004104.
FT   VARSPLIC   2175   2182       Missing (in isoform 3).
FT                                /FTId=VSP_004105.
FT   VARSPLIC   2175   2183       SADTFQKVE -> Q (in isoform 6).
FT                                /FTId=VSP_004106.
FT   VARSPLIC   2895   2907       VFGFYNMCFSTLC -> GSSIPELAHSDAYQTREICSS
FT                                (in isoform 2, isoform 3 and isoform 6).
FT                                /FTId=VSP_004107.
FT   VARSPLIC   2895   2948       Missing (in isoform 5).
FT                                /FTId=VSP_004108.
FT   VARSPLIC   3901   3911       STTQFHAGMRR -> ALSLTTSWQHHSARPTAPLFFEILSH
FT                                SLG (in isoform 6).
FT                                /FTId=VSP_004109.
FT   VARIANT    1347   1347       K -> KQ.
FT                                /FTId=VAR_010926.
FT   CONFLICT     76     76       E -> Q (IN REF. 3).
FT   CONFLICT    475    475       M -> I (IN REF. 3).
FT   CONFLICT    554    554       E -> G (IN REF. 3).


FT   CONFLICT    638    638       R -> S (IN REF. 3).
FT   CONFLICT   [illegible]
FT   CONFLICT   [illegible]       [illegible] (IN REF. 1 AND [illegible]
FT   CONFLICT   [illegible] 997   [illegible] (IN REF. 1 AND [illegible]
FT   CONFLICT   1001   1001       [illegible] (IN REF. 1 AND 2).
FT   CONFLICT   [illegible]       [illegible] (IN REF. [illegible]).
FT   CONFLICT   1028   1028       V -> E (IN REF. 3).
FT   CONFLICT   1626   1626       R -> P (IN REF. 1 AND 2).
FT   CONFLICT   1703   1703       N -> T (IN REF. 3).
FT   CONFLICT   1707   1707       V -> G (IN REF. 3).
FT   CONFLICT   1802   1803       MISSING (IN REF. 5).
FT   CONFLICT   1843   1843       A -> P (IN REF. 3).

Query Match 7.5%; Score 128.5; DB 1; Length 3911;
Best Local Similarity 20.1%; Pred. No. 3.3;
Matches 77; Conservative 75; Mismatches 116; Indels 115; Gaps 15;

Qy 18 VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVAQLAQELYSSG 77

Mill I II : I hh - h Ih M I
Db 664 IEKLKDNLGIHYKQ--QIDGLQNEMSQKIETMQ FEKDNLITKQNQLILE 710

Qy 78 LLVTLIADLQ--LIDFEGKKDVTQIFNNILRRQI GTRSPTVEYISAHPHI 125

:: : ||| |:: : :: || | II h :

Db 711 --ISKLKDLQQSLVNSKSEEMTLQI--NELQKEIEILRQEEKEKGTLEQEVQELQLKTEL 766

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFA 185

I :| I I - I :| I I

Db 767 LEKQMKEKE NDLQEKFAQLEAEN-SILKDEKK 797

Qy 186 TFKDLLTRH KVLVADFLE-QNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237

I :hl I I - - h-l -I II hi -| h

Db 798 TLEDMLKIHTPVSQEERLIFLDSIKSKSKDSVWEKEIEILIEENEDLKQQCIQLNEEIEK 857

Qy 238 DRHNFAIMTK YISKPENLKLMMNLLRD 264

h h I I II : I -I I

Db 858 QRNTFSFAEKNFEVNYQELQEEYACLLKVKDDLEDSKNKQELEYKSKLKALNEELHLQRI 917

Qy 265 KSPNIQFEA--FHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTD-DEQFAD- 320

:: :: I | ||| :| : |:: |: :|:| | ::|: : :: :|
Db 918 NPTTVKMKSSVFDEDKTFVA---ETLEMGEWEKDTTELMEKLEVTKREKLELSQRLSDL 974

Qy 321 EKNYLIKQIRDLKK 334

I ::| :::: Ih
Db 975 SEQLKQKHGEISFLNEEVKSLKQ 997



Search completed: January 7, 2004, 16:45:28 
Job time : 20 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on:          January 7, 2004, 16:44:17 ; Search time 41 Seconds
                                             (without alignments)
                                             2121.067 Million cell updates/sec

Title:           US-10-088-872-2
Perfect score:   1704
Sequence:        1 MKKMPLFSKSHKNPAEIVKI...FADEKNYLIKQIRDLKKTAP 337

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        830525 seqs, 258052604 residues

Total number of hits satisfying chosen parameters:   830525

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SPTREMBL_23:*
                  1: sp_archea:*
                  2: sp_bacteria:*
                  3: sp_fungi:*
                  4: sp_human:*
                  5: sp_invertebrate:*
                  6: sp_mammal:*
                  7: sp_mhc:*
                  8: sp_organelle:*
                  9: sp_phage:*
                 10: sp_plant:*
                 11: sp_rodent:*
                 12: sp_virus:*
                 13: sp_vertebrate:*
                 14: sp_unclassified:*
                 15: sp_rvirus:*
                 16: sp_bacteriap:*
                 17: sp_archeap:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
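
Note on the Query Match column: the Query Match percentages in this report are numerically consistent with each hit's score expressed as a percentage of the perfect score (1704 for this query). That relationship is inferred from the figures printed here, not taken from GenCore documentation, and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Express a hit score as a percentage of the query's perfect score.

    Assumption: this mirrors how the report's "Query Match" figure appears
    to be derived; it is not a documented GenCore formula.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    # Round to one decimal place, as the report prints it.
    return round(100.0 * score / perfect_score, 1)


if __name__ == "__main__":
    # Scores taken from the result headers of this report.
    for score in (1684, 1462, 875):
        print(score, query_match_percent(score, 1704))
```

With the perfect score of 1704, the scores 1684, 1462 and 875 give 98.8, 85.8 and 51.3, matching the values printed for results 1, 4 and 7.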



SUMMARIES

Result           Query
   No.   Score   Match  Length  DB  ID      Description

     1    1684    98.8     337  11  Q8BG52  Q8bg52 mus musculu
     2    1669    97.9     334  11  Q91WB8  Q91wb8 mus musculu
     3    1663    97.6     334  11  Q91YL0  Q91yl0 mus musculu
     4    1462    85.8     289   4  Q96FG1  Q96fg1 homo sapien
     5    1381    81.0     341  11  Q8VDZ8  Q8vdz8 mus musculu
     6  1066.5    62.6     636   5  Q21643  Q21643 caenorhabdi
     7     875    51.3     205  11  Q8K312  Q8k312 mus musculu
     8   709.5    41.6     333  10  Q8H5L9  Q8h5l9 oryza sativ
     9   671.5    39.4     345  10  Q8L9L9  Q8l9l9 arabidopsis
    10     590    34.6     322  10  Q8LIF3  Q8lif3 oryza sativ
    11     435    25.5     103  11  Q8K038  Q8k038 mus musculu
    12   134.5     7.9     677  16  O25188  O25188 helicobacte
    13     128     7.5     430  16  O26049  O26049 helicobacte
    14   123.5     7.2    1285  16  Q9WXU3  Q9wxu3 thermotoga
    15     120     7.0    1175  17  Q58914  Q58914 methanococc
    16   119.5     7.0    1056  16  Q8REF7  Q8ref7 fusobacteri
    17     119     7.0    1111   5  Q9VGE4  Q9vge4 drosophila
    18   118.5     7.0     554   5  Q8IN90  Q8in90 drosophila
    19   118.5     7.0     670   5  Q9VEC7  Q9vec7 drosophila
    20   118.5     7.0     670   5  Q9NFM7  Q9nfm7 drosophila
    21     117     6.9     808   5  Q8T133  Q8t133 dictyosteli
    22     117     6.9     808   5  Q9GSH4  Q9gsh4 dictyosteli
    23   116.5     6.8    1135   5  Q9NJQ4  Q9njq4 paramecium
    24     116     6.8     911  16  Q8EUI7  Q8eui7 mycoplasma
    25     116     6.8    1389   5  Q8I293  Q8i293 plasmodium
    26   115.5     6.8    1111   5  Q9U0K5  Q9u0k5 plasmodium
    27   115.5     6.8    1946   5  O97291  O97291 plasmodium
    28     115     6.7     473  11  Q8R436  Q8r436 mus musculu
    29     115     6.7    2518   5  Q8IEH2  Q8ieh2 plasmodium
    30   114.5     6.7    1941   5  Q8IAK6  Q8iak6 plasmodium
    31     114     6.7     743  13  Q9YGE7  Q9yge7 oncorhynchu
    32   113.5     6.7     833   4  Q9UF54  Q9uf54 homo sapien
    33   113.5     6.7     951   5  Q9VEC6  Q9vec6 drosophila
    34   113.5     6.7     984   5  Q8IN89  Q8in89 drosophila
    35     113     6.6     474   5  O97233  O97233 plasmodium
    36     113     6.6     647  11  Q8CA10  Q8ca10 mus musculu
    37   111.5     6.5    1925   5  Q8I2D1  Q8i2d1 plasmodium
    38   111.5     6.5    2429   5  Q9VFB1  Q9vfb1 drosophila
    39   111.5     6.5    2771   5  Q26216  Q26216 plasmodium
    40     111     6.5    2166  16  O51465  O51465 borrelia bu
    41     111     6.5    2819  16  Q98QP8  Q98qp8 mycoplasma
    42     110     6.5     461   5  O77390  O77390 plasmodium
    43     110     6.5    1183   2  O86064  O86064 helicobacte
    44     110     6.5    1758   5  Q8I1K5  Q8i1k5 plasmodium
    45   109.5     6.4     457  16  Q9PQM0  Q9pqm0 ureaplasma



ALIGNMENTS 



RESULT 1
Q8BG52
ID Q8BG52 PRELIMINARY; PRT; 337 AA.
AC Q8BG52;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE MO25-like protein homolog.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Eye, Pituitary, and Testis;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK030474; BAC26978.1;
DR EMBL; AK053642; BAC35457.1;
DR EMBL; AK076758; BAC36470.1; -.
SQ SEQUENCE 337 AA; 39105 MW; C62B5B58095A98C8 CRC64;

Query Match 98.8%; Score 1684; DB 11; Length 337;
Best Local Similarity 98.5%; Pred. No. 1.1e-110;
Matches 332; Conservative 2; Mismatches 3; Indels 0; Gaps 0;

Qy    1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKE 60
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||:||
Db    1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKE 60

Qy   61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYIS 120
        |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Db   61 PPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRCPTVEYIS 120

Qy  121 AHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180
        :|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 SHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIA 180

Qy  181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRH 240

Qy  241 NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300
        || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 NFTIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPK 300

Qy  301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
        |||||||||||||||||||||||||||||||||| ||
Db  301 LIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 337

RESULT 2
Q91WB8
ID Q91WB8 PRELIMINARY; PRT; 334 AA.
AC Q91WB8;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Similar to hypothetical protein FLJ12577 (MO25-like protein
DE homolog).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Salivary gland;
RA Strausberg R.;
RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Testis;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation of
RT 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; BC016128; AAH16128.1; -.
DR EMBL; AK076867; BAC36513.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 334 AA; 38718 MW; 822F04A87FB4EB6F CRC64;

Query Match 97.9%; Score 1669; DB 11; Length 334;
Best Local Similarity 98.5%; Pred. No. 1.3e-109;
Matches 329; Conservative 2; Mismatches 3; Indels 0; Gaps 0;

Qy    4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db    1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy   64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
        ||||||||||||||||||||||||||||||||||||||||||||||||| |||||||:||
Db   61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRCPTVEYISSHP 120

Qy  124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy  184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFT 240

Qy  244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy  304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
        ||||||||||||||||||||||||||||||| ||
Db  301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 334



RESULT 3
Q91YL0
ID Q91YL0 PRELIMINARY; PRT; 334 AA.
AC Q91YL0;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-JUN-2002 (TrEMBLrel. 21, Last annotation update)
DE Similar to hypothetical protein FLJ12577.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC016546; AAH16546.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 334 AA; 38761 MW; 5F9765360653750E CRC64;

Query Match 97.6%; Score 1663; DB 11; Length 334;
Best Local Similarity 98.2%; Pred. No. 3.3e-109;
Matches 328; Conservative 2; Mismatches 4; Indels 0; Gaps 0;

Qy    4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db    1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy   64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHP 123
        ||||||||||||||||||||||||||||||||||||||||||||||||| |||||||:||
Db   61 EAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRCPTVEYISSHP 120

Qy  124 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 183
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 HILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDA 180

Qy  184 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFA 243
        |||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||
Db  181 FATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLRGELILDRHNFT 240

Qy  244 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 303
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 IMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIE 300

Qy  304 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
        ||||||||||||||||||||||||||||||| ||
Db  301 FLSSFQKERTDDEQFADEKNYLIKQIRDLKKAAP 334

RESULT 4
Q96FG1
ID Q96FG1 PRELIMINARY; PRT; 289 AA.
AC Q96FG1;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE Hypothetical protein.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Placenta;
RA Strausberg R.;
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC010993; AAH10993.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 289 AA; 33738 MW; F57B9EFCF6ABF2D7 CRC64;

Query Match 85.8%; Score 1462; DB 4; Length 289;
Best Local Similarity 99.7%; Pred. No. 3.8e-95;
Matches 288; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   49 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQ 108
        |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db    1 MKEILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEEKKDVTQIFNNILRRQ 60

Qy  109 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF 168
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 IGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDF 120

Qy  169 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 228
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 FKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQS 180

Qy  229 LKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQ 288
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 LKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQ 240

Qy  289 PIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 337
        |||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 PIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTAP 289



RESULT 5
Q8VDZ8
ID Q8VDZ8 PRELIMINARY; PRT; 341 AA.
AC Q8VDZ8;
DT 01-MAR-2002 (TrEMBLrel. 20, Created)
DT 01-MAR-2002 (TrEMBLrel. 20, Last sequence update)
DT 01-OCT-2002 (TrEMBLrel. 22, Last annotation update)
DE MO25 protein.
GN CAB39.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC020041; AAH20041.1; -.
DR MGD; MGI:107438; Cab39.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 341 AA; 39843 MW; E7FECA529D6FE811 CRC64;

Query Match 81.0%; Score 1381; DB 11; Length 341;
Best Local Similarity 81.0%; Pred. No. 2.3e-89;
Matches 273; Conservative 31; Mismatches 29; Indels 4; Gaps 2;

Qy 4 MPL-FSKSHKNPAEIVKILKDNLAILEKQ---DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

II I lllhlhlll l|:::|:||ll III Uhllllhl Mllll Mill 
Db 1 MPFPFGKSHKSPADIVKNLKESMAVLEKQDISDKKAEKATEEVSKNLVAMKEILYGTNEK 60 

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

II lllllllllllhllll Ihllllllllllllll lllllllllllilhllllll 
Db 61 EPQTEAVAQLAQELYNSGLLGTLVADLQLIDFEGKKDVAQIFNNILRRQIGTRTPTVEYI 120 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

HIIIIMIIhhIII llllllllllllllllllhl II llhllhlllll 
Db 121 CTQQNILFMLLKGYESPEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDI 180 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDR 239 

Mill mill llllhl |:||l|:|l I MUM llllllllllllllllhlll 
Db 181 ASDAFATFKDLLTRHKLLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDR 240 

Qy 240 HNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQP 299 

Ml IIIIMIMIIIIIMIIMIII IIIMIIIIIIIMhhIlllhMIIIII 

Db 241 HNFTIMTKYISKPENLKLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQT 300 

Qy 300 KLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKTA 336 

Mill II II MIMIII III IhllllMh I 

Db 301 KLIEFLSKFQNDRTEDEQFNDEKTYLVKQIRDLKRAA 337 



RESULT 6
Q21643
ID Q21643 PRELIMINARY; PRT; 636 AA.
AC Q21643;
DT 01-NOV-1996 (TrEMBLrel. 01, Created)
DT 01-OCT-2001 (TrEMBLrel. 18, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical 72.3 kDa protein.
GN R02E12.2.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RX MEDLINE=99069613; PubMed=9851916;
RA None;
RT "Genome sequence of the nematode C. elegans: a platform for
RT investigating biology. The C. elegans Sequencing Consortium.";
RL Science 282:2012-2018(1998).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA Leimbach D.;
RT "The sequence of C. elegans cosmid R02E12.";
RL Submitted (APR-1996) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=Bristol N2;
RA Waterston R.;
RT "Direct Submission.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U53337; AAA96186.2;
DR WormPep; R02E12.2; CE28410.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 636 AA; 72282 MW; 85D5853E9F0E3193 CRC64;

Query Match 62.6%; Score 1066.5; DB 5; Length 636;
Best Local Similarity 60.4%; Pred. No. 6.3e-67;
Matches 212; Conservative 53; Mismatches 69; Indels 17; Gaps 3;

Qy 2 KKMP-LFSKSHKNPAEIVKILKDNLAILEK QDKKTDKASEEVSKSLQ 47 

I II II lllhlhHI I Ihl III III :||||:: 

Db 258 KVMPLLFGKSHKSPADWKTLREVLTILDKLPPPKLDKDGNIQSDKKYDKALDEVSKNVA 317 

Qy 48 AMKEILCGTNEKEPPTE AVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNI 104 

H : I : II :| Mllllhh: H II I Ul MM IMIh 

Db 318 MIKSFIYGNDSAEPSSEHWQVAQLAQEVYNANILPMLIKMLPKPEFECKKDVGQIFNNL 377 

Qy 105 LRRQIGTRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQ 164 

MIMMMMIIh I I II hMI I III IhllM llh Mlllhh 

Db 378 LRRQIGTRSPTVEYLGARPEILIQLVQGYSVPDIALTCGLMLRESIRHDHLAKIILYSDV 437

Qy 165 FRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVT 224 

I II Ih llhlllhllhl MM ::hM: MM I h II hllll 
Db 438 FYTFFLYVQSEVFDISSDAFSTFKELTTRHKAIIAEFLDSNYDTFFAQYQNLLNSKNYVT 497 

Qy 225 KRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASP 284 

MMMIMIhllMM llllll IMhIII llllll MhMllllllllhl 

Db 498 RRQSLKLLGELLLDRHNFNTMTKYISNPDNLRLMMELLRDKSRNIQYEAFHVFKVFVANP 557

Qy 285 HKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDLKKT 335 

■■\ --W Hi --h Ihllll I :||||||| III I I Mil::: I : 
Db 558 NKPKPISDILNRNREKLVEFLSEFHNDRTDDEQFNDEKAYLIKQIQEMKSS 608 



RESULT 7
Q8K312
ID Q8K312 PRELIMINARY; PRT; 205 AA.
AC Q8K312;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Similar to calcium binding protein, 39 kDa (Fragment).
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Strausberg R.;
RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC029053; AAH29053.1; -.
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
FT NON_TER 1 1
SQ SEQUENCE 205 AA; 24582 MW; 015261A02F808169 CRC64;

Query Match 51.3%; Score 875; DB 11; Length 205;
Best Local Similarity 83.6%; Pred. No. 4.8e-54;
Matches 168; Conservative 17; Mismatches 16; Indels 0; Gaps 0;

Qy 136 PQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHK 195 

hill MM III 1 1 Mill II llhl II llhllhlllllllllllllllllllll 

Db 1 PEIALNCGIMLRECIRHEPLAKIILWSEQFYDFFRYVEMSTFDIASDAFATFKDLLTRHK 60 

Qy 196 VLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENL 255

M hllllMI I MUM II Mill II II IIMI hill IIMMM lllllll 

Db 61 LLSAEFLEQHYDRFFSEYEKLLHSENYVTKRQSLKLLGELLLDRHNFTIMTKYISKPENL 120 

Qy 256 KLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDD 315 

Mlllllllll IIIMIIIMIIIIhhIlllhMIIIII IIMIM II Mhl 

Db 121 KLMMNLLRDKSRNIQFEAFHVFKVFVANPNKTQPILDILLKNQTKLIEFLSKFQNDRTED 180 

Qy 316 EQFADEKNYLIKQIRDLKKTA 336 

Ml III Ihllllllh I 
Db 181 EQFNDEKTYLVKQIRDLKRAA 201 



RESULT 8
Q8H5L9
ID Q8H5L9 PRELIMINARY; PRT; 333 AA.
AC Q8H5L9;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Putative MO25 protein (CGI-66).
GN OJ1060_D03.13.
OS Oryza sativa (japonica cultivar-group).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
OC Ehrhartoideae; Oryzeae; Oryza.
OX NCBI_TaxID=39947;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Yamamoto K.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, BAC
RT clone:OJ1060_D03.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AP003803; BAC22269.1; -.
SQ SEQUENCE 333 AA; 38452 MW; CB6FC45E098C2401 CRC64;

Query Match 41.6%; Score 709.5; DB 10; Length 333;
Best Local Similarity 44.0%; Pred. No. 3.7e-42;
Matches 147; Conservative 67; Mismatches 109; Indels 11; Gaps 5;

Qy 6 LFSKSHKNPAEIVKILKDNLAILEKQ DKKTDKASEEVSKSLQAMKEILCGTNEK 59 

II : ll-h - I h II hlh- Hill I 

Db 4 LFKSKPRTPADWRQTRELLIFLDLHSGSRGGDAKREEKMAELSKNIRELKSILYGNGES 63 

Qy 60 EPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYI 119 

II III Mil: I II I I Ml Ih h hh - Ih 

Db 64 EPVTEACVQLTQEFFRENTLRLLIICLPKLNLETRKDATQWANLQRQQVSSKIVASEYL 123 

Qy 120 SAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDI 179 

h :| h II III I lllllllh :| :| h : || III 
Db 124 EANKDLLDTLI-SYENMDIALHYGSMLRECIRHQSIA-YVLESDHMKKFFDYIQLPNFDI 181 

Qy 180 ASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYE-KLLQSENYVTKRQSLKLLGELILD 238

MM lllhllllli Ihll MM I Ml I IIMIIhM l|::MI 

Db 182 ASDASATFKELLTRHKATVAEFLSKNYDWFFSEFNTRLLSSTNYITKRQAIKFLGDMLLD 241 

Qy 239 RHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQ 298 

I I M MM Ml :MIIIII I III IIIIIIIM h M M Ih h 

Db 242 RSNSTVMMRYVSSKDNLMILMNLLRDSSKNIQIEAFHVFKLFAANKNKPTEVVNILVTNR 301 

Qy 299 PKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDL 332

Ih I : h - MM M MIM I 

Db 302 SKLLRFFAGFKIDK--DEQFEADKEQVIKEISAL 333 

RESULT 9
Q8L9L9
ID Q8L9L9 PRELIMINARY; PRT; 345 AA.
AC Q8L9L9;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots; Rosidae;
OC eurosids II; Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID=3702;
RN [1]
RP SEQUENCE FROM N.A.
RA Haas B.J., Volfovsky N., Town C.D., Troukhan M., Alexandrov N.,
RA Feldmann K.A., Flavell R.B., White O., Salzberg S.L.;
RT "Full-length messenger RNA sequences greatly improve genome
RT annotation.";
RL Genome Biol. 0:0-0(2002).
RN [2]
RP SEQUENCE FROM N.A.
RA Brover V., Troukhan M., Alexandrov N., Lu Y.-P., Flavell R.,
RA Feldmann K.;
RT "Full-Length cDNA from Arabidopsis thaliana.";
RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY088359; AAM65898.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
KW Hypothetical protein.
SQ SEQUENCE 345 AA; 39841 MW; 2C46A3D3DEBB47AA CRC64;

Query Match 39.4%; Score 671.5; DB 10; Length 345;
Best Local Similarity 42.9%; Pred. No. 1.8e-39;
Matches 140; Conservative 68; Mismatches 113; Indels 5; Gaps 2;

Qy 12 KNPAEIVKILKDNLAILEKQD KKTDKASEEVSKSLQAMKEILCGTNEKEPPTEAVA 67 

I I hi! -hi h : I :|l III h II I I II : 

Db 12 KTPQEWKAIRDSLMALDTKTWEVKALEKALEEVEKNFSSLRGILSGDGETEPNADQAV 71 

Qy 68 QLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHILF 127 

III I :| I - :| :|h - ^h^H hi I U 

Db 72 QLALEFCKEDWSLVIHKLHILGWETRKDLLHCWSILLKQKVGDTYCCVQYFEEHFELLD 131 

Qy 128 MLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTFDIASDAFATF 187 

h h HII II lllllh III II I I llhlll Ihllllhll 
Db 132 SLWCYDNKEIALHCGSMLRECIKFPSLAKYILESACFELFFKFVELPNFDVASDAFSTF 191 

Qy 188 KDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTK 247 

llllhl :h:|l :| h Ihll I lllhlllllM : :h I li I 
Db 192 KDLLTKHDSWSEFLTSHYTEFFDVYERLLTSSNYVTRRQSLKLLSDFLLEPPNSHIMKK 251 

Qy 248 YISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSS 307 

:| : Ih^l Ihl I III llhlhllhhl I : II H Ihl I 
Db 252 FILEVRYLKVIMTLLKDSSKNIQISAFHIFKIFVANPNKPQEVKIILARNHEKLLELLHD 311 

Qy 308 FQKER-TDDEQFADEKNYLIKQIRDL 332 

: ::|:|| :|| :|::|: \ 
Db 312 LSPGKGSEDDQFEEEKELIIEEIQKL 337 

RESULT 10
Q8LIF3
ID Q8LIF3 PRELIMINARY; PRT; 322 AA.
AC Q8LIF3;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Hypothetical protein (P0503D09.26 protein).
GN OJ1316_A04.9 OR P0503D09.26.
OS Oryza sativa (japonica cultivar-group).
OC Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
OC Ehrhartoideae; Oryzeae; Oryza.
OX NCBI_TaxID=39947;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Yamamoto K.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, BAC
RT clone:OJ1316_A04.";
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=cv. Nipponbare;
RA Sasaki T., Matsumoto T., Katayose Y.;
RT "Oryza sativa nipponbare (GA3) genomic DNA, chromosome 7, PAC
RT clone:P0503D09.";
RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AP003822; BAC06992.1;
DR EMBL; AP005455; BAC16736.1;
DR Gramene; Q8LIF3;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 2.
KW Hypothetical protein.
SQ SEQUENCE 322 AA; 37091 MW; 99434DFA7C2DCD21 CRC64;

Query Match 34.6%; Score 590; DB 10; Length 322;
Best Local Similarity 38.5%; Pred. No. 9e-34;
Matches 129; Conservative 73; Mismatches 109; Indels 24; Gaps 4;

Qy 4 MPLFSKSHKNPA EIVKILKDNLAILEKQDKKTD-KASEEVSKSLQAMKEILCGTN 57 

I I :: M |:|: H-l I I U II hi I- I I 

Db 1 MSFFFRAASRPARPSPQELVRSIKESLLAL---DTRTGAKALEDVEKNVSTLRQTLSGDG 57 

Qy 58 EKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVE 117 

I II I I hi h :| : : Hhlh - Hh- h 

Db 58 EVEPNQEQVLQIALEICKEDVLSLFVQNMPSLGWEGRKDLAHCWSILLRQKVDEAYCCVQ 117 

Qy 118 YISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFKYVELSTF 177 

II I :| h h ::|l II lllllh: III M h I I lU III II 

Db 118 YIENHFDLLDFLWCYKNLEVALNCGNMLRECIKYPTLAKYILESSSFELFFQYVELSNF 177 

Qy 178 DIASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELIL 237 

mill llllllhh h:|l -l- II I -W I lllhllhl I I :| 
Db 178 DIASDALNTFKDLLTKHEAAVSEFLCSHYEQFFELYTRLLTSTNYVTRRQSVKFLSEFLL 237 

Qy 238 DRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKN 297 

: I II :|l : I :h II llllhhl = h-h I 

Db 238 EAPNAQIMKRYIVEVSYLNIMIGLL KVFVANPNKPRDIIQVLVDN 282 

Qy 298 QPKLIEFLSSFQKERTDDEQFADEKNYLIKQIRDL 332 

:|:: | : : :||| :|:: :||:| | 

Db 283 HRELLKLLGNLPTSKGEDEQLEEERDLIIKEIEKL 317 



RESULT 11
Q8K038
ID Q8K038 PRELIMINARY; PRT; 103 AA.
AC Q8K038;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE Similar to RIKEN cDNA 1500031K13 gene.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Kidney;
RA Strausberg R.;
RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC034159; AAH34159.1;
DR InterPro; IPR004892; Mo25.
DR Pfam; PF03204; Mo25; 1.
SQ SEQUENCE 103 AA; 11291 MW; EA86A9F6E9E426E0 CRC64;

Query Match 25.5%; Score 435; DB 11; Length 103;
Best Local Similarity 97.8%; Pred. No. 1.8e-23;
Matches 89; Conservative 1; Mismatches 1; Indels 0; Gaps 0;

Qy    4 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPT 63
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||:|||||
Db    1 MPLFSKSHKNPAEIVKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNDKEPPT 60

Qy   64 EAVAQLAQELYSSGLLVTLIADLQLIDFEGK 94
        ||||||||||||||||||||||||||||| |
Db   61 EAVAQLAQELYSSGLLVTLIADLQLIDFEVK 91



RESULT 12 

025188 

ID 025188 PRELIMINARY; PRT; 677 AA. 

AC 025188; 

DT 01 -JAN- 1998 (TrEMBLrel . 05, Created) 

DT Ol-JAN-1998 (TrEMBLrel. 05, Last sequence update) 

DT Ol-MAR-2003 (TrEMBLrel. 23, Last annotation update) 

DE DNA topoisomerase I (TOPA) . 

GN HP0440. 

OS Helicobacter pylori (Campylobacter pylori) . 

OC Bacteria; Proteobacteria ; Epsilonproteobacteria; Campylobacterales ; 

OC Helicobacteraceae; Helicobacter. 

OX NCBI__TaxID=210; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=26695 / ATCC 700392; 

RX MEDLINE=97394467; PubMed=9252185;

RA Tomb J.-F., White O., Kerlavage A.R., Clayton R.A., Sutton G.G.,

RA Fleischmann R.D., Ketchum K.A., Klenk H.-P., Gill S., Dougherty B.A.,

RA Nelson K., Quackenbush J., Zhou L., Kirkness E.F., Peterson S.,

RA Loftus B., Richardson D., Dodson R., Khalak H.G., Glodek A.,

RA McKenney K., FitzGerald L.M., Lee N., Adams M.D., Hickey E.K.,

RA Berg D.E., Gocayne J.D., Utterback T.R., Peterson J.D., Kelley J.M.,

RA Cotton M.D., Weidman J.M., Fujii C., Bowman C., Watthey L., Wallin E.,

RA Hayes W.S., Borodovsky M., Karp P.D., Smith H.O., Fraser C.M.,

RA Venter J.C.;

RT "The complete genome sequence of the gastric pathogen Helicobacter

RT pylori.";

RL Nature 388:539-547(1997). 

DR EMBL; AE000559; AAD07502.1; 

DR TIGR; HP0440; 

DR InterPro; IPR003601; DNAtopI_ATP_bind.

DR InterPro; IPR003602; DNAtopI_DNA_bind.

DR InterPro; IPR000380; DNA_tpisomrase.

DR InterPro; IPR006171; Toprim_dom. 

DR InterPro; IPR006154; Toprim_sub. 

DR Pfam; PF01131; Topoisom_bac; 1. 

DR Pfam; PF01751; Toprim; 1. 

DR PRINTS; PR00417; PRTPISMRASEI.



DR SMART; SM00437; TOP1Ac; 1.

DR SMART; SM00436; TOP1Bc; 1.

DR SMART; SM00493; TOPRIM; 1.

KW Hypothetical protein; Isomerase; Complete proteome. 

SQ SEQUENCE 677 AA; 77677 MW; 4B285B81F1092BB4 CRC64;



Query Match 7.9%; Score 134.5; DB 16; Length 677; 

Best Local Similarity 21.6%; Pred. No. 0.24; 

Matches 88; Conservative 58; Mismatches 134; Indels 127; Gaps 16; 

Qy 7 FSKSHKNPA-EIVKILKDNL AILEKQDKK TDKASEEVSKSLQAMKE 51 

I II I : :| III I h II I I : lllh 

Db 222 FKFKDKNEASQFLKDLKDGLGSMSVLVSLKESLSNKEPKKPFTTSKLLSQASKSLKI--- 278 

Qy 52 ILCGTNEKEPPTEAVAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGT 111 

Ih :|||||:|: :|h I = : I |: I I 
Db 279 PTKEIAQLAQKLFEAGLITYHRTDSEFLSPEYLKEHEVFFEPIY 322 

Qy 112 RSPTV EYIS -AHPHILFMLLKGYEAPQIALRCGIMLRECIRHE 153 

hi II : III I I I :|: : I : I 

Db 323 --PSVYQYREYKAGKNSQAEAHEAIRITHPHALKDLEKVCSDAKISEELALKLYQLIYTN 380 

Qy 154 PL---AKIILFSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQNYDTIF 210 

: h Ih II I l-l II I : I I : I 

Db 381 TICSQSRNALY-NQYDCIFK IKSESFKLSFKLLKEKGFLEIEELIQGKEEIN 431 

Qy 211 EDYEKLLQSENYVTKRQSLKLLGELILDRHNFAIMTKYISKPENLKLMMNLLRDKSPNIQ 270 

: |: : ||: I | |: : : : | || : || : 
Db 432 RE-EQESEIENFSLKENDSVPLKEVFIKK IEKPSPKPYKESAFIPLLESEG 481 

Qy 271 FEAFHVFKVFVASPHKTQPIVEILLKNQ PKLIEFLSSFQKERTDD- 315 

: I :::||| : : :| :| |:|:: | 

Db 482 IGRPSTYASFLDLLLKRKYISIDTKTNAITPTSQGLEVISFFKKDKEVDF 531 

Qy 316 EQF ADEKNYLIKQIRDLKKTA 336 

:|| I : I II II 

Db 532 IALTSKDKSKLGNTTKQFEECLDLIMRGEASYEKFMLEVISKLKSTA 578 



RESULT 13 

O26049

ID O26049 PRELIMINARY; PRT; 430 AA.

AC O26049;

DT 01-JAN-1998 (TrEMBLrel. 05, Created)

DT 01-JAN-1998 (TrEMBLrel. 05, Last sequence update)

DT 01-MAR-2002 (TrEMBLrel. 20, Last annotation update)

DE Hypothetical protein HP1520. 

GN HP1520. 

OS Helicobacter pylori (Campylobacter pylori).

OC Bacteria; Proteobacteria; Epsilonproteobacteria; Campylobacterales;

OC Helicobacteraceae; Helicobacter. 

OX NCBI_TaxID=210; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=26695 / ATCC 700392; 

RX MEDLINE=97394467; PubMed=9252185;

RA Tomb J.-F., White O., Kerlavage A.R., Clayton R.A., Sutton G.G.,

RA Fleischmann R.D., Ketchum K.A., Klenk H.-P., Gill S., Dougherty B.A.,

RA Nelson K., Quackenbush J., Zhou L., Kirkness E.F., Peterson S.,

RA Loftus B., Richardson D., Dodson R., Khalak H.G., Glodek A.,

RA McKenney K., FitzGerald L.M., Lee N., Adams M.D., Hickey E.K.,

RA Berg D.E., Gocayne J.D., Utterback T.R., Peterson J.D., Kelley J.M.,

RA Cotton M.D., Weidman J.M., Fujii C., Bowman C., Watthey L., Wallin E.,

RA Hayes W.S., Borodovsky M., Karp P.D., Smith H.O., Fraser C.M.,

RA Venter J.C.;

RT "The complete genome sequence of the gastric pathogen Helicobacter 

RT pylori."; 

RL Nature 388:539-547(1997). 

DR EMBL; AE000650; AAD08565.1; -. 

DR TIGR; HP1520; -. 

KW Hypothetical protein; Complete proteome. 

SQ SEQUENCE 430 AA; 50573 MW; 23DC6FE5E956B629 CRC64;

Query Match 7.5%; Score 128; DB 16; Length 430; 

Best Local Similarity 20.9%; Pred. No. 0.39; 

Matches 82; Conservative 73; Mismatches 135; Indels 102; Gaps 20; 

Qy 7 FSKSHKNPAEI VKILKDNLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPP 62 

I : |: II |1||:|:|: |: : :|h : I - h 
Db 60 FYPNRKSKIEIEFNGEKILKENVAVFHSYDE--EFSSEDSVTTFMAKSDL KQQY 111 

Qy 63 TEAVAQLAQELYSSGLLVTL--IA DLQLIDFEGKKDVTQIFNNILR 106 

: :| :| II \ II I I I U H I 

Db 112 DNILLELEKE--KKALLKSLRDIASGFDYEEEIKTIKNEKNKSFYEILDNHLTEIESSEK 169 

Qy 107 RQIGTRSPTV-EYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHEPLAKII 159 

II I I :::: I :: |: | |:: 

Db 170 HYSFKYRDIFDGSKKVKDFVNKHHDLIEQYFNKYQ ELLSQSK 211 

Qy 160 LF SNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVLVADFLEQ 204 

Db 212 IFKHMNSGDFGTNHADDLKKALENNRFFKANHSLKIAGEEITNYQKL-SDIFENEKNRIL 270 

Qy 205 NYDTIFEDYEKLLQSENYVTKRQSLKLLGELI LDRHNF- -AIMTKYISKP 252 

I : : I ::t: | : : || : | || :| :: |: : 

Db 271 NNEELKESFDKI---EKVINANKELKAFICDAISKDNTLLTEFLDYDSFRKKVLFSYLKQV 327 

Qy 253 -ENLKLMMNLLRDKSPNIQFEAFHVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKE 311 

:|:| ::|| |:|| h : I : : -11 II |: I I : 

Db 328 IQNVKSLVNLYREKKPEIE EIIKQASKDQKEWESVIEIF- -NQRFLVPFKVELQNQ 381 

Qy 312 R TDDEQ FADEKNYLIKQIRDLKK 334 

I I hh : I Ihl 

Db 382 KDILLNKDAAQFRFIFSDDNQDMNVQKEDLQK 413 



RESULT 14 
Q9WXU3 

ID Q9WXU3 PRELIMINARY; PRT; 1285 AA. 

AC Q9WXU3;

DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)
DE COME protein, putative. 



GN TM0088. 

OS Thermotoga maritima. 

OC Bacteria; Thermotogae; Thermotogales; Thermotogaceae; Thermotoga.

OX NCBI_TaxID=2336;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=MSB8 / DSM 3109; 

RX MEDLINE=99287316; PubMed=10360571;

RA Nelson K.E., Clayton R.A., Gill S.R., Gwinn M.L., Dodson R.J.,

RA Haft D.H., Hickey E.K., Peterson J.D., Nelson W.C., Ketchum K.A.,

RA McDonald L., Utterback T.R., Malek J.A., Linher K.D., Garrett M.M.,

RA Stewart A.M., Cotton M.D., Pratt M.S., Phillips C.A., Richardson D.,

RA Heidelberg J., Sutton G.G., Fleischmann R.D., Eisen J.A., White O.,

RA Salzberg S.L., Smith H.O., Venter J.C., Fraser C.M.;

RT "Evidence for lateral gene transfer between Archaea and Bacteria from 

RT genome sequence of Thermotoga maritima."; 

RL Nature 399:323-329(1999). 

DR EMBL; AE001695; AAD35182.1; 

DR TIGR; TM0088; 

DR InterPro; IPR004846; GSPII/IIIprotein.

DR InterPro; IPR001993; Mitoch_carrier.

DR Pfam; PF00263; GSPII_III; 1. 

DR PROSITE; PS00215; MITOCH_CARRIER; 1. 

KW Complete proteome. 

SQ SEQUENCE 1285 AA; 145209 MW; 057435F821FB0EA5 CRC64;

Query Match 7.2%; Score 123.5; DB 16; Length 1285; 

Best Local Similarity 21.5%; Pred. No. 3; 

Matches 86; Conservative 78; Mismatches 129; Indels 107; Gaps 23 

Qy 1 MKKMPLFSKSHKNPAEIVKILKDNLAILEKQD KKT DKASEEV SKS 45 

H I I :| h : I h III III I II 

Db 556 LKVAMLSGKEEEN VQKAAEELQI I SSEERI I RFVKKTENVPI DKAKNWLQLYSVS 611 

Qy 46 LQAMKEILCGTNEKEPPTEAVAQLAQELYSSGL LVTLIAD-- 85 

:: : | |:| | | | h-ll : h - : 

Db 612 IEELGNELWIGERE-EVEKAADLLQKIFSSEVEISRDFVKLPSWIDEQEKLLEWKNSA 670 

Qy 86 LQLID FEGKKD VTQIFNNILRRQIG- -TRSPTVEYI SAHPHILFML 129 

-:| III h ::|::|: : :| : 111- h I h 

Db 671 GITYEILDGWYFEGTKENVEKAKELFSDIVEK-LGEVRKEETVEFLEVNSSFPVDEFIN 729 

Qy 130 LKGYEAPQIALRCGIMLRECIRHEPLAKIIL FSNQFRDFF KYVELST 176 

III: I : I ::| |: :| || | |: : 

Db 730 LSGKLYPDVT CFSLDQLGLLVLKGSSEAVEDLSSMYRSFFERHQKIVKENV 780 

Qy 177 FD IASDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLG 233 

II : : :h: I UN : : - I M I I- : :| I 

Db 781 FDRLMLEVPSGFSFEEFKTFLEVLVPEVKQ WYLDKLNLLLVEVPVSQSERVKSLL 836 

Qy 234 ELILDRHNFAIMTKYIS KPENL-KLMMNLLRDKSPNIQFEAF-HVFKVFVAS 283 

\ \ h I : III |:: I :: :: I 

Db 837 DTFLKKEEAVSEKKAVKSVTIPSGVNPDELSSYLKKLLR NVEITVFPNMGQMIVEG 892 

Qy 284 P-HKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEK 322 

I : II- = I- III I : =1 I 

Db 893 PENEVEKAVELVEAEKEKIV LKERKDYVKVSDGK 926 



RESULT 15 
Q58914 



ID Q58914 PRELIMINARY; PRT; 1175 AA. 

AC Q58914; 

DT 01-JUN-1998 (TrEMBLrel. 06, Created)

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)

DT 01-MAR-2003 (TrEMBLrel. 23, Last annotation update)

DE Hypothetical protein MJ1519. 

GN MJ1519. 

OS Methanococcus jannaschii. 

OC Archaea; Euryarchaeota; Methanococci; Methanococcales;

OC Methanocaldococcaceae; Methanocaldococcus.

OX NCBI_TaxID=2190;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=JAL-1 / DSM 2661 / ATCC 43067; 

RX MEDLINE=96337999; PubMed=8688087;

RA Bult C.J., White O., Olsen G.J., Zhou L., Fleischmann R.D.,

RA Sutton G.G., Blake J.A., FitzGerald L.M., Clayton R.A., Gocayne J.D.,

RA Kerlavage A.R., Dougherty B.A., Tomb J.-F., Adams M.D., Reich C.I.,

RA Overbeek R., Kirkness E.F., Weinstock K.G., Merrick J.M., Glodek A.,

RA Scott J.L., Geoghagen N.S.M., Weidman J.F., Fuhrmann J.L., Nguyen D.,

RA Utterback T.R., Kelley J.M., Peterson J.D., Sadow P.W., Hanna M.C.,

RA Cotton M.D., Roberts K.M., Hurst M.A., Kaine B.P., Borodovsky M.,

RA Klenk H.-P., Fraser C.M., Smith H.O., Woese C.R., Venter J.C.;

RT "Complete genome sequence of the methanogenic archaeon, Methanococcus 

RT jannaschii."; 

RL Science 273:1058-1073(1996). 

DR EMBL; U67593; AAB99538.1; 

DR TIGR; MJ1519; -. 

DR InterPro; IPR003593; AAA_ATPase.

DR SMART; SM00382; AAA; 1. 

KW Hypothetical protein; Complete proteome. 

SQ SEQUENCE 1175 AA; 138618 MW; 99082EA5A4D11140 CRC64;

Query Match 7.0%; Score 120; DB 17; Length 1175;

Best Local Similarity 21.5%; Pred. No. 4.8; 

Matches 76; Conservative 58; Mismatches 131; Indels 88; Gaps 15 

Qy 7 FSKSHKNPAEIVKILKD-NLAILEKQDKKTDKASEEVSKSLQAMKEILCGTNEKEPPTEA 65 

hi : : I I I hi II |: :| : I : :: |: | | 

Db 232 FNKFREENQDFDKYLTDENIAFRPHVMKKFDEFAENIKKVIAELE GSKYKYPGLPG 287 

Qy 66 VAQLAQELYSSGLLVTLIADLQLIDFEGKKDVTQIFNNILRRQIGTRSPTVEYISAHPHI 125 

I II h ::| Ihl :::| - : I :|: 
Db 288 V LYFLGMEDAYSRYIELWKNEGEKGEEKLYNALI-ESLENRKENLEF 333 

Qy 126 LFMLLKGYEAPQIALRCGIMLRECIRHEPLAKIILFSNQFRDFFK YVELSTFDIA- 180 

I : : II :||:| I I III I : 

Db 334 GITKKVIDKFIAQKEEFREFLKNYAVYYELSAFKLEK 370 

Qy 181 SDAFATFKDLLTRHKVLVADFLEQNYDTIFEDYEKLLQSENYVTKRQSL 229 

I Ul I l-l : H-: I \ I 

Db 371 IKEQYEKEFINLDNIIKNPYILVED-LKEN DSFERI IFEELDSWERRRLGDKFNP 424 



Qy 230 KLLGELILDRH NFAIMTKYISKPENLKLMMNLLRDKSPNIQFEAF 274 

II I II II II II :|| : I h I 

Db 425 YSPYRVRALLVE-ILKRHLSSGNTTISTK DLKDFFEKMDKDIVKITFDEFLRI I 477 

Qy 275 HVFKVFVASPHKTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIK 327 

:| :: | : : : : | : | | | : : : | : | : |||:| 
Db 478 EEYKDIIS--EKVEIVKKEVKNNENKEIIELFTLKEIREyEEIIENTINYLLK 528 



Search completed: January 7, 2004, 16:48:05 
Job time : 56 secs



